Removing Duplicate Page Content
-
Since joining SEOMOZ four weeks ago I've been busy tweaking our site, a magento eCommerce store, and have successfully removed a significant portion of the errors.
Now I need to remove/hide duplicate pages from the search engines and I'm wondering what is the best way to attack this?
Can I solve this in one central location, or do I need to do something in the Google & Bing webmaster tools?
Here is a list of duplicate content
http://www.unitedbmwonline.com/?dir=asc&mode=grid&order=name http://www.unitedbmwonline.com/?dir=asc&mode=list&order=name
http://www.unitedbmwonline.com/?dir=asc&order=name http://www.unitedbmwonline.com/?dir=desc&mode=grid&order=name http://www.unitedbmwonline.com/?dir=desc&mode=list&order=name http://www.unitedbmwonline.com/?dir=desc&order=name http://www.unitedbmwonline.com/?mode=grid http://www.unitedbmwonline.com/?mode=listThanks in advance,
Steve
-
Thank you Cyrus I will certainly read the blog post and consider the noindex, nofollow on content with a canonical tag that differs from the current served page' uri.
I am still at little confused as to why the SEOMOZ crawl is highlighting duplicate pages when the canonical tag is present and pointing to the primary content.
Take the following example page for example:-
http://www.planksclothing.com/planks-classic-t-shirt-black-multi.html
Firstly the page has a canonical tag. There is no search on the site and product is viewed a root level without directory structure, which in a Magento instance is the common problem with duplicate content...
Currently at the time of writing SEOMOZ is updating my duplicate repor, so I can't find out what is the duplicate content. Maybe it is updating to say it is not
Thanks
Amendment: After reading the supplied blog post (http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world) I have learn't that the above page is just not different and probably is in the area of "Thin Content".
-
There are many, many different types of duplicate content, and how you handle it depends on the specific type of duplicate content and your needs.
If you haven't already, I highly suggest you read Dr. Pete's excellent post on dupe content here: http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
In your specific case it looks like you have multiple parameters serving the same basic content as your homepage. Is this correct?
In this case, you should set a canonical on every page pointing to the homepage. This also has the benefit of solving the errors in the SEOmoz PRO app.
It also sounds like you've addressed the issue in Google's Webmaster Tools. Unfortunately, Google doesn't let SEOmoz sync with Webmaster Tools, so anything you set there won't show up in the Web App.
Finally, don't forget about Bing Webmaster. They have similar parameter settings you can submit.
By the way, some SEOs would suggest putting meta robots "NOINDEX, FOLLOW" tags on those duplicate pages. While this may potentially send conflicting signals when coupled with the canonical tag, it is a potentially valid approach.
Hope this helps! Best of luck with your SEO.
-
This is exactly my current situation...
As a result of the SEOMOZ Duplicate content report I set about resolving these issues...
In the first instance I configured URL parameters via Google Webmaster Tools. It instantly occurred to me that whilst this fixes these potential duplicate content in Google this configuration does not affect other search engines and the work is unlikely to be reflected in future SEOMOZ crawls of the site.
I'm interested in creating a over arching method of removing the potential duplication caused via URL parameters required to paginate, sort and filter content. The majority of these URL parameters are standardized across web applications. But is it actually required?
In my case each Magento store uses the canonical tag correctly and has an updated robots.txt to restrict the crawling of areas of the site that should be excluded... In a sense this is the over arching method of removing potential duplicate content. So why is SEOMOZ reporting duplicate content?
I suppose the big question is... Is SEOMOZ crawling the site correctly, do these results reflect robots.txt and canonical tags?
-
Thank you for your thoughts.
As mentioned in my above response, canonical tags have already been configured for the site, it's just this home page that remains the issue.
-
Thanks for your response.
I looked in URL Parameters and see dir & mode are already defined.
Then I searched the http://www.unitedbmwonline.com page source for canonical links and none are defined, though I do have canonical tags setup for the rest of the site
Any other thoughts of how to remove these duplicates?
-
You can also tell Google to ignore certain query string variables through Webmaster Tools.
For instance, indicate that "dir" and "mode" have no impact on content.
Other SE's have simular controls.
-
This is why the canonical tag was invented, to solve duplicate content issues when URL parameters are involved. Set a canonical tag on all these pages to point towards the version of the page you want to appear in search results. As long as the pages are identical, or close to it, the search engines (most likely) will respect the canonical tag, and pass along the duplicate versions link juice to the page you're pointing to.
Here's some info: http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html. If you Google "canonical tag", you'll find lots more!
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Search console, duplicate content and Moz
Hi, Working on a site that has duplicate content in the following manner: http://domain.com/content
Intermediate & Advanced SEO | | paulneuteboom
http://www.domain.com/content Question: would telling search console to treat one of them as the primary site also stop Moz from seeing this as duplicate content? Thanks in advance, Best, Paul. http0 -
On Page Content. has a H2 Tag but should I also use H3 tags for the sub headings within this body of content
Hi Mozzers, My on page content comes under my H2 tag. I have a few subheadings within my content to help break it up etc and currently this is just underlined (not bold or anything) and I am wondering from an SEO perspective, should I be making these sub headings H3 tags. Otherwise , I just have 500-750 words of content under an H2 tag which is what I am currently doing on my landing pages. thanks pete
Intermediate & Advanced SEO | | PeteC120 -
Noindexing Duplicate (non-unique) Content
When "noindex" is added to a page, does this ensure Google does not count page as part of their analysis of unique vs duplicate content ratio on a website? Example: I have a real estate business and I have noindex on MLS pages. However, is there a chance that even though Google does not index these pages, Google will still see those pages and think "ah, these are duplicate MLS pages, we are going to let those pages drag down value of entire site and lower ranking of even the unique pages". I like to just use "noindex, follow" on those MLS pages, but would it be safer to add pages to robots.txt as well and that should - in theory - increase likelihood Google will not see such MLS pages as duplicate content on my website? On another note: I had these MLS pages indexed and 3-4 weeks ago added "noindex, follow". However, still all indexed and no signs Google is noindexing yet.....
Intermediate & Advanced SEO | | khi50 -
URL Parameters Duplicate Page Title
Thanks in advance, I'm getting duplicate page titles because seomoz keeps crawling through my url parameters. I added forcefiltersupdate to the URL parameters in webmaster tools but it has not seemed to have an effect. Below is an example of the duplicate content issue that I am having. http://qlineshop.com/OC/index.php?route=product/category&path=59_62&forcefiltersupdate=true&checkedfilters[]=a.13.13.387baf0199e7c9cc944fae94e96448fa Any thoughts? Thanks again. -Patrick
Intermediate & Advanced SEO | | bamron0 -
How to prevent duplicate content within this complex website?
I have a complex SEO issue I've been wrestling with and I'd appreciate your views on this very much. I have a sports website and most visitors are looking for the games that are played in the current week (I've studied this - it's true). We're creating a new website from scratch and I want to do this is as best as possible. We want to use the most elegant and best way to do this. We do not want to use work-arounds such as iframes, hiding text using AJAX etc. We need a solid solution for both users and search engines. Therefor I have written down three options: Using a canonical URL; Using 301-redirects; Using 302-redirects. Introduction The page 'website.com/competition/season/week-8' shows the soccer games that are played in game week 8 of the season. The next week users are interested in the games that are played in that week (game week 9). So the content a visitor is interested in, is constantly shifting because of the way competitions and tournaments are organized. After a season the same goes for the season of course. The website we're building has the following structure: Competition (e.g. 'premier league') Season (e.g. '2011-2012') Playweek (e.g. 'week 8') Game (e.g. 'Manchester United - Arsenal') This is the most logical structure one can think of. This is what users expect. Now we're facing the following challenge: when a user goes to http://website.com/premier-league he expects to see a) the games that are played in the current week and b) the current standings. When someone goes to http://website.com/premier-league/2011-2012/ he expects to see the same: the games that are played in the current week and the current standings. When someone goes to http://website.com/premier-league/2011-2012/week-8/ he expects to the same: the games that are played in the current week and the current standings. So essentially there's three places, within every active season within a competition, within the website where logically the same information has to be shown. To deal with this from a UX and SEO perspective, we have the following options: Option A - Use a canonical URL Using a canonical URL could solve this problem. You could use a canonical URL from the current week page and the Season page to the competition page: So: the page on 'website.com/$competition/$season/playweek-8' would have a canonical tag that points to 'website.com/$competition/' the page on 'website.com/$competition/$season/' would have a canonical tag that points to 'website.com/$competition/' The next week however, you want to have the canonical tag on 'website.com/$competition/$season/playweek-9' and the canonical tag from 'website.com/$competition/$season/playweek-8' should be removed. So then you have: the page on 'website.com/$competition/$season/playweek-9' would have a canonical tag that points to 'website.com/$competition/' the page on 'website.com/$competition/$season/' would still have a canonical tag that points to 'website.com/$competition/' In essence the canonical tag is constantly traveling through the pages. Advantages: UX: for a user this is a very neat solution. Wherever a user goes, he sees the information he expects. So that's all good. SEO: the search engines get very clear guidelines as to how the website functions and we prevent duplicate content. Disavantages: I have some concerns regarding the weekly changing canonical tag from a SEO perspective. Every week, within every competition the canonical tags are updated. How often do Search Engines update their index for canonical tags? I mean, say it takes a Search Engine a week to visit a page, crawl a page and process a canonical tag correctly, then the Search Engines will be a week behind on figuring out the actual structure of the hierarchy. On top of that: what do the changing canonical URLs to the 'quality' of the website? In theory this should be working all but I have some reservations on this. If there is a canonical tag from 'website.com/$competition/$season/week-8', what does this do to the indexation and ranking of it's subpages (the actual match pages) Option B - Using 301-redirects Using 301-redirects essentially the user and the Search Engine are treated the same. When the Season page or competition page are requested both are redirected to game week page. The same applies here as applies for the canonical URL: every week there are changes in the redirects. So in game week 8: the page on 'website.com/$competition/' would have a 301-redirect that points to 'website.com/$competition/$season/week-8' the page on 'website.com/$competition/$season' would have a 301-redirect that points to 'website.com/$competition/$season/week-8' A week goes by, so then you have: the page on 'website.com/$competition/' would have a 301-redirect that points to 'website.com/$competition/$season/week-9' the page on 'website.com/$competition/$season' would have a 301-redirect that points to 'website.com/$competition/$season/week-9' Advantages There is no loss of link authority. Disadvantages Before a playweek starts the playweek in question can be indexed. However, in the current playweek the playweek page 301-redirects to the competition page. After that week the page's 301-redirect is removed again and it's indexable. What do all the (changing) 301-redirects do to the overall quality of the website for Search Engines (and users)? Option C - Using 302-redirects Most SEO's will refrain from using 302-redirects. However, 302-redirect can be put to good use: for serving a temporary redirect. Within my website there's the content that's most important to the users (and therefor search engines) is constantly moving. In most cases after a week a different piece of the website is most interesting for a user. So let's take our example above. We're in playweek 8. If you want 'website.com/$competition/' to be redirecting to 'website.com/$competition/$season/week-8/' you can use a 302-redirect. Because the redirect is temporary The next week the 302-redirect on 'website.com/$competition/' will be adjusted. It'll be pointing to 'website.com/$competition/$season/week-9'. Advantages We're putting the 302-redirect to its actual use. The pages that 302-redirect (for instance 'website.com/$competition' and 'website.com/$competition/$season') will remain indexed. Disadvantages Not quite sure how Google will handle this, they're not very clear on how they exactly handle a 302-redirect and in which cases a 302-redirect might be useful. In most cases they advise webmasters not to use it. I'd very much like your opinion on this. Thanks in advance guys and galls!
Intermediate & Advanced SEO | | StevenvanVessum0 -
Pages with Little Content
I have a website that lists events in Dublin, Ireland. I want to provide a comprehensive number of listings but there are not enough hours in the day to provide a detailed (or even short) unique description for every event. At the moment I have some pages with little detail other than the event title and venue. Should I try and prevent Google from crawling/indexing these pages for fear of reducing the overall ranking of the site? At the moment I only link to these pages via the RSS feed. I could remove the pages entirely from my feed, but then that mean I remove information that might be useful to people following the events feed. Here is an example page with very little content
Intermediate & Advanced SEO | | andywozhere0 -
Duplicate page content and duplicate pate title
Hi, i am running a global concept that operates with one webpage that has lot of content, the content is also available on different domains, but with in the same concept. I think i am getting bad ranking due to duplicate content, since some of the content is mirrored from the main page to the other "support pages" and they are almost 200 in total. Can i do some changes to work around this or am i just screwed 🙂
Intermediate & Advanced SEO | | smartmedia0 -
Duplicate Content On A Subdomain
Hi, We have a client who is currently close to completing a site specifically aimed at the UK market (they're doing this in-house so we've had no say in how it will work). The site will almost be a duplicate (in terms of content, targeted keywords etc.) of a section of the main site (that sits on the root domain) - the main site is targeted toward the US. The only difference will be certain spellings and currency type. If this new UK site were to sit on a sub domain of the main site, which is a .com, will this cause duplicate content issues? I know that there wouldn't be an issue if the new site were to be on a separate .co.uk domain (according to Matt Cutts), but it looks like the client wants it to be on a sub domain. Any help/advice would be greatly appreciated.
Intermediate & Advanced SEO | | jasarrow0