Duplicate Content due to Panda update!
-
I can see that a lot of you are worrying about this new Panda update just as I am!
I have such a headache trying to figure this one out, can any of you help me?
I have thousands of pages that are "duplicate content" which I just can't for the life of me see how... take these two for example:
http://www.eteach.com/Employer.aspx?EmpNo=18753
http://www.eteach.com/Employer.aspx?EmpNo=31241
My campaign crawler is telling me these are duplicate content pages because of the same title (which I can see) and because of the content (which I can't see).
Can anyone see how Google is interpreting these two pages as duplicate content??
Stupid Panda!
-
Hi Virginia
This is frustrating indeed as it certainly doesn't look like you've used duplicate content in a malicious way.
To understand why Google might be seeing these pages as duplicate content, let's take a look at the pages through Googlebot's eyes:
Google Crawl for page 1
Google Crawl for page 2
What you'll see here is that Google is reading the entirety of both pages, with the only difference being a logo that it can't see and a name + postal address. The rest of the page is duplicate. This should point out that Google reads things like site navigation menus and footers and interprets them, for the purpose of Panda, as "content".
This doesn't mean that you should have a different navigation on every page (that wouldn't be feasible). But it does mean that you need enough unique content on each page to show Google that the pages are not duplicates and have content of their own. I can't give you a % on this, but as a rough guide, 300-400 words of unique content would do the trick.
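To get a rough feel for how much of two pages is shared template and how much is unique, you can compare their visible text. The sketch below is only an illustration: it uses word shingles and Jaccard similarity, a common near-duplicate measure, and it is not a description of how Panda or any crawler actually scores pages. The function names and the sample page text are made up for the example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical page text: identical navigation and footer,
# with only the employer name and address differing.
TEMPLATE = ("home jobs schools recruiters contact us "
            "search for teaching jobs by region and subject ")
FOOTER = (" about us terms and conditions privacy policy "
          "cookies help site map")
page_1 = TEMPLATE + "Oakwood Primary School 12 High Street Leeds" + FOOTER
page_2 = TEMPLATE + "Riverside Academy 4 Mill Lane Bristol" + FOOTER

if __name__ == "__main__":
    print(f"similarity: {similarity(page_1, page_2):.2f}")
```

Even with different names and addresses, the two sample pages share most of their shingles because the navigation and footer dominate the text. Adding a few hundred words of unique copy to each page would push the score down.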
Now, this might be feasible for some of your pages, but for the two pages you've linked to above, there simply isn't enough you could write about. Similarly, because each employer gets its own URL through the EmpNo query parameter, you could potentially have hundreds or thousands of pages you'd need to add content to, which is a hell of a lot of work.
So here's what I'd do. I'd get a list of each URL on your site that could be seen as "duplicate" content, like the ones above. Be as harsh in judging this as Google would be. I'd then decide whether you can add further content to these pages or not. For description pages or "about us" pages, you can perhaps add a bit more. For URLs like the ones above, you should do the following:
In the <head> section of each of these URLs you've identified, add this code:
<meta name="robots" content="noindex, nofollow">
This tells Googlebot not to index the URLs or follow the links on them. Once a page is out of the index, it can't rank and it won't be counted as duplicate content. This would be perfect for the URLs you've given above as I very much doubt you'd ever want to rank these pages, so you can safely noindex and nofollow them.
Furthermore, as these URLs are created from queries, I am assuming that you may have one "master" page that the URLs are generated from. This may mean that you would only need to add the meta tag to this one page for it to apply to all of them. I'm not certain on this and you should clarify with your developers and/or whoever runs your CMS. The important thing, however, is to have the meta tag applied to all those duplicate content URLs that you don't want to rank for. For those that you do want to rank for, you will need to add more unique content to those pages in order to stop them being flagged as duplicate.
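Once the change is live, it's worth spot-checking that the tag really is on the pages you meant to exclude. Here is a minimal sketch using only Python's standard library. It assumes the tag in question is the standard robots meta tag with noindex and nofollow, and it works on HTML you have already downloaded. The class name, function names and sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower()
                for part in content.split(",")
                if part.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_noindexed(html):
    """True if the page asks search engines not to index it."""
    return "noindex" in robots_directives(html)


# Hypothetical sample pages for illustration.
tagged = ('<html><head><title>Employer</title>'
          '<meta name="robots" content="noindex, nofollow">'
          '</head><body></body></html>')
untagged = ('<html><head><title>Employer</title></head>'
            '<body></body></html>')

if __name__ == "__main__":
    print(is_noindexed(tagged), is_noindexed(untagged))
```

Run it over the HTML of each URL on your "duplicate" list. Any page where `is_noindexed` comes back false has been missed by the template change.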
As always, there's a great Moz post on how to deal with duplication issues right here.
Hope this helps, Virginia, and if you have any more questions, feel free to ask me!