How does Google decide what content is "similar" or "duplicate"?
-
Hello all,
I have a massive duplicate content issue at the moment with a load of old employer detail pages on my site. We have 18,000 pages that look like this:
http://www.eteach.com/Employer.aspx?EmpNo=26626
http://www.eteach.com/Employer.aspx?EmpNo=36986
and Google is classing all of these pages as similar content, which may result in a bunch of them being de-indexed. Although they all look rubbish, some of them are ranking in search engines, and looking at the traffic on a couple of them, it's clear that people who find these pages want to find out more about the school (because everyone seems to click on the local information tab on the page). So I don't want to just get rid of all these pages; I want to add content to them.
But my question is...
If I were to make up, say, 5 templates of generic content, with different fields being replaced with the school's name, location and headteacher's name so that they vary from other pages, would this be enough for Google to realise that they are not similar pages and stop classing them as duplicates?
e.g. [School name] is a busy and dynamic school led by [headteacher's name], who achieves excellence every year from Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, and we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards.
Something like that...
Anyone know if Google would slap me if I did that across 18,000 pages (with 4 other templates to choose from)?
-
Hi Virginia,
Maybe this Whiteboard Friday can help you out.
-
Hey Virginia
That is essentially what we call near duplicates. It is the kind of content that can easily be created by pulling fields out of a database, dynamically creating the pages, and dropping the name, address etc. into the placeholders.
Unique content is exactly that, unique content, so this approach is probably not going to cut it. You could still pull certain elements, such as the address, from the database, but you need to remove these duplicate blocks and keep the pages simpler (like a business directory), and ideally add some unique elements to each page.
Pages like these often still rank for very specific queries. Another strategy is to build well-thought-out landing pages that link to them: the detail pages have value for users even if they are not search friendly.
So, assess how well these pages work as landing pages from search. Are visitors arriving from search, or coming in from elsewhere? If they come in from elsewhere, you could noindex these pages or block them in robots.txt. Then target the bigger search terms higher up the tree and create good search landing pages that link to these pages for users.
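As a rough sketch of the robots.txt option, the snippet below uses Python's standard `urllib.robotparser` to check which URLs a crawler would be allowed to fetch. The `Disallow` rule and the landing page URL are assumptions made up for illustration; only the `Employer.aspx` URL comes from the question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the blocked path mirrors the URLs in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Employer.aspx
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

employer_url = "http://www.eteach.com/Employer.aspx?EmpNo=26626"
landing_url = "http://www.eteach.com/schools/leeds"  # hypothetical landing page

# The employer detail page matches the Disallow prefix; the landing page does not.
print("employer page crawlable:", parser.can_fetch("Googlebot", employer_url))
print("landing page crawlable:", parser.can_fetch("Googlebot", landing_url))
```

Running a check like this against a draft robots.txt before deploying it is a cheap way to confirm that the landing pages stay crawlable while the detail pages are blocked.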
This is a real good read to get a better handle on duplicate content types and the relevant strategies:
http://moz.com/blog/fat-pandas-and-thin-content
Hope that helps
Marcus
-
Hi Virginia,
If you take your pages as a whole, code and all, the only slight difference in those pages is the
tag and the sidebar info with school address. The rest of the page code is exactly the same.
If you were to create 5 templates similar to:
[School name] is a busy and dynamic school led by [headteacher's name], who achieves excellence every year from Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, and we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards.
If all you are doing is changing the [school name] and [location] etc., I'm sure Google will still flag these pages as duplicate content.
Unique content is the best way. If there's not a lot of competition for the school name and the page has enough content about each individual school, head teacher etc., then "templates" might work. You can try it out, but I'd say unique content is the best way. It's the nature of the beast with so many pages.
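To get a feel for why swapping a few fields changes so little, here is a minimal sketch that compares two templated pages using word shingles and Jaccard similarity. This is a common textbook technique for near-duplicate detection, not Google's actual method, which is not public. The template is a shortened version of the one above, and the school names, head names and locations are invented for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


TEMPLATE = (
    "{school} is a busy and dynamic school led by {head}. "
    "Located in {location}, {school} offers a wide range of experiences "
    "both in the classroom and through extra-curricular activities. "
    "We value all our teachers and support staff and work hard to keep "
    "the reputation of {school} to the highest standards."
)

# Hypothetical field values, invented for illustration only.
page_a = TEMPLATE.format(school="Northfield", head="Smith", location="Leeds")
page_b = TEMPLATE.format(school="Riverside", head="Jones", location="Bristol")

similarity = jaccard(shingles(page_a), shingles(page_b))
print(f"Jaccard similarity: {similarity:.2f}")
```

Even with every field value different, most of the shingles are shared between the two pages, and on a real page the identical surrounding markup and navigation would push the overlap higher still.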
Hope this helps.
Robert