Site scraped: over 400,000 URLs
-
Our business is heavily dependent on SEO traffic from long-tail search. We have over 400,000 pieces of content, all of which we found scraped and published by another site based out of Hong Kong (we're in the US).
Google has a process for DMCA takedowns, but filing one for each URL would be beyond tedious for such a large set. The scraped content is outranking us in many searches and we've noticed a drastic decrease in organic traffic, likely from a duplicate content penalty.
Has anyone dealt with an issue like this? I can't seem to find much help online.
-
Hi Kibin
Firstly, it's unlikely that their scraped content will affect your rankings. Google generally knows who originated it. However:
Do you have the hreflang tag on your website, specifying your language and location? If their site has one as well, then technically the two sites are targeting different countries, so there should be no duplicate content issue once you add it.
https://support.google.com/webmasters/answer/189077?hl=en
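If it helps, here is a minimal sketch of what the tag looks like and how you might generate it across a large set of pages. It assumes a single US English version of each page ("en-us"); the example.com URL and the `hreflang_link` helper are hypothetical, so adapt them to however your templates build the page head.

```python
from html import escape


def hreflang_link(url: str, lang_region: str = "en-us") -> str:
    """Build a <link rel="alternate"> tag declaring the language/region of a page.

    The default "en-us" is an assumption (US English); use your own code.
    """
    return (
        '<link rel="alternate" '
        f'hreflang="{escape(lang_region, quote=True)}" '
        f'href="{escape(url, quote=True)}" />'
    )


if __name__ == "__main__":
    # Hypothetical page URL; the tag belongs in that page's <head>.
    print(hreflang_link("https://www.example.com/some-article"))
```

Each page would carry a tag pointing at its own URL, so with 400,000 pages this is something to add in the template rather than by hand.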
I would tell Google about the site and submit a sample of 10 URLs first: https://www.google.com/webmasters/tools/dmca-dashboard. Telling them is an absolute must, even if it's only a few URLs.
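To take some of the tedium out of it, a rough sketch like the one below could pull the first sample of 10 from your list of scraped URLs and split the rest into batches for later notices. The function names, the batch size, and the scraper.example URLs are all assumptions for illustration; I don't know what per-notice limit the form applies, so check that before choosing a size.

```python
import random
from typing import Iterator, List


def sample_urls(urls: List[str], k: int = 10, seed: int = 0) -> List[str]:
    """Pick up to k URLs at random for the first takedown notice.

    A fixed seed keeps the sample reproducible between runs.
    """
    rng = random.Random(seed)
    return rng.sample(urls, min(k, len(urls)))


def batches(urls: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs for follow-up notices."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


if __name__ == "__main__":
    # Hypothetical scraped URLs; in practice, load them from your own export.
    scraped = [f"https://scraper.example/page-{i}" for i in range(25)]
    print(sample_urls(scraped))
    for group in batches(scraped, size=10):
        print(len(group))
```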
Also email the hosting company to tell them that they are hosting copied content and that the penalties are severe.
Finally, write to the company itself, warn them that you are taking legal action, and send them a cease-and-desist letter. I am sure you can get one knocked up for a few dollars by a friendly solicitor.
Watch this: https://www.youtube.com/watch?v=gGc_jc3Oznk. It's a bit long but worth it.
Do all of these things.
Regards
Nigel