How does Google decide what content is "similar" or "duplicate"?
-
Hello all,
I have a massive duplicate content issue at the moment with a load of old employer detail pages on my site. We have 18,000 pages that look like this:
http://www.eteach.com/Employer.aspx?EmpNo=26626
http://www.eteach.com/Employer.aspx?EmpNo=36986
and Google is classing all of these pages as similar content, which may result in a bunch of them being de-indexed. Although they all look rubbish, some of them are ranking in search engines, and looking at the traffic on a couple of them, it's clear that people who find these pages want more information on the school (because everyone seems to click on the local information tab on the page). So I don't want to just get rid of all these pages; I want to add content to them.
But my question is...
If I were to make up, say, 5 templates of generic content, with fields replaced by the school's name, location and headteacher's name so that each page varies from the others, would that be enough for Google to recognise that they are not similar pages and stop classing them as duplicates?
e.g. [School name] is a busy and dynamic school led by [headteacher's name] who achieve excellence every year from Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, and we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards.
Something like that...
Anyone know if Google would slap me if I did that across 18,000 pages (with 4 other templates to choose from)?
-
Hi Virginia,
Maybe this Whiteboard Friday can help you out.
-
Hey Virginia
That is essentially what we call near-duplicate content: the kind that is easily created by pulling fields out of a database and dropping the name, address, etc. into placeholders to generate the pages dynamically.
Unique content is exactly that, unique, so this approach is probably not going to cut it. You could still pull certain elements, such as the address, this way, but you need to remove these duplicate blocks and keep the pages simpler (like a business directory), and ideally add some unique elements to each page.
These kinds of pages often still rank for very specific queries. Another strategy is to build well-thought-out landing pages that link to pages like these, which have value for users but are not search friendly.
So, assess how well these work as landing pages from search: are visitors arriving from search, or coming in from elsewhere? If they come in from elsewhere, you could noindex these pages or block them in robots.txt. Then target the bigger search terms higher up the tree and create good search landing pages that link to these other pages for users.
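If you go the robots.txt route, it is worth checking that the rule matches the pages you mean before deploying it. Here is a minimal sketch using Python's standard `urllib.robotparser`; the `Disallow` rule and the `/jobs` path are assumptions for illustration, not taken from the live site. Bear in mind that a robots.txt block stops crawling rather than indexing, and a crawler that is blocked from a page cannot see a noindex tag on it, so the two options are normally alternatives rather than something to combine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rule covering every Employer.aspx?EmpNo=... URL.
robots_txt = """\
User-agent: *
Disallow: /Employer.aspx
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

employer_url = "http://www.eteach.com/Employer.aspx?EmpNo=26626"
other_url = "http://www.eteach.com/jobs"  # hypothetical path, for contrast

# can_fetch() reports whether the given user agent may crawl the URL.
print("Employer page crawlable:", parser.can_fetch("Googlebot", employer_url))
print("Other page crawlable:", parser.can_fetch("Googlebot", other_url))
```

The rule is a simple path prefix, so it covers every `EmpNo` value without listing them individually.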
This is a really good read to get a better handle on the types of duplicate content and the relevant strategies:
http://moz.com/blog/fat-pandas-and-thin-content
Hope that helps
Marcus
-
Hi Virginia,
If you take your pages as a whole, code and all, the only slight difference in those pages is the
tag and the sidebar info with school address. The rest of the page code is exactly the same.
If you were to create 5 templates similar to:
[School name] is a busy and dynamic school led by [headteacher's name] who achieve excellence every year from Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, and we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards.
If all you are doing is changing the [school name] and [location] etc., I'm sure Google will still flag these pages as duplicate content.
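A rough way to see why: Google's actual duplicate detection is not public, but a common textbook measure of near-duplication is the overlap of word "shingles" (Jaccard similarity). The sketch below is only an illustration under that assumption. The school names, headteachers and locations are made up, the template is a shortened version of the one above, and the hand-written description is invented for contrast.

```python
import re

# Shortened version of the template from the thread (assumption: trimmed for brevity).
TEMPLATE = (
    "{school} is a busy and dynamic school led by {head} who achieve "
    "excellence every year from Ofsted. Located in {location}, {school} "
    "offers a wide range of experiences both in the classroom and through "
    "extra-curricular activities."
)

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9'-]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical schools filled into the same template.
page_a = TEMPLATE.format(school="Oakfield Primary", head="Jane Smith", location="Leeds")
page_b = TEMPLATE.format(school="Riverside Academy", head="Tom Brown", location="Bristol")

# Hypothetical hand-written description of the first school.
handwritten = (
    "Oakfield Primary in Leeds opened a new science lab in 2012 and runs "
    "a popular after-school coding club."
)

similarity = jaccard(shingles(page_a), shingles(page_b))
handwritten_similarity = jaccard(shingles(page_a), shingles(handwritten))

print(f"template vs template:    {similarity:.2f}")
print(f"template vs handwritten: {handwritten_similarity:.2f}")
```

Two pages built from the same template share most of their shingles even though every placeholder differs, while the hand-written description shares almost none. On a real page the shared navigation and markup would push the overlap between templated pages higher still.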
Unique content is the best way. If there's not a lot of competition for the school name and the page has enough content about each individual school, head teacher etc., then "templates" might work. You can try it out, but I'd say unique content is the best way. It's the nature of the beast with so many pages.
Hope this helps.
Robert