Removing Duplicate Page Content
-
Since joining SEOmoz four weeks ago, I've been busy tweaking our site, a Magento eCommerce store, and have successfully removed a significant portion of the errors.
Now I need to remove or hide duplicate pages from the search engines, and I'm wondering what the best way to attack this is.
Can I solve this in one central location, or do I need to do something in the Google & Bing webmaster tools?
Here is a list of the duplicate content:
http://www.unitedbmwonline.com/?dir=asc&mode=grid&order=name
http://www.unitedbmwonline.com/?dir=asc&mode=list&order=name
http://www.unitedbmwonline.com/?dir=asc&order=name
http://www.unitedbmwonline.com/?dir=desc&mode=grid&order=name
http://www.unitedbmwonline.com/?dir=desc&mode=list&order=name
http://www.unitedbmwonline.com/?dir=desc&order=name
http://www.unitedbmwonline.com/?mode=grid
http://www.unitedbmwonline.com/?mode=list
Thanks in advance,
Steve
-
Thank you, Cyrus. I will certainly read the blog post and consider noindex, nofollow on content whose canonical tag differs from the URI of the page currently being served.
I am still a little confused as to why the SEOmoz crawl is highlighting duplicate pages when the canonical tag is present and pointing to the primary content.
Take the following page, for example:
http://www.planksclothing.com/planks-classic-t-shirt-black-multi.html
Firstly, the page has a canonical tag. There is no search on the site, and the product is viewed at root level without a directory structure, which in a Magento instance is the common cause of duplicate content...
At the time of writing, SEOmoz is updating my duplicate report, so I can't find out what the duplicate content is. Maybe it is updating to say it is not.
Thanks
Amendment: After reading the supplied blog post (http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world), I have learnt that the above page is just not different enough and probably falls into the area of "Thin Content".
-
There are many, many different types of duplicate content, and how you handle it depends on the specific type of duplicate content and your needs.
If you haven't already, I highly suggest you read Dr. Pete's excellent post on dupe content here: http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
In your specific case it looks like you have multiple parameters serving the same basic content as your homepage. Is this correct?
In this case, you should set a canonical on every page pointing to the homepage. This also has the benefit of solving the errors in the SEOmoz PRO app.
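As a rough illustration of that advice, here is a minimal Python sketch of how a canonical URL could be derived by dropping the presentation-only parameters seen in the question (dir, mode, order). The function names `canonical_url` and `canonical_tag` are hypothetical, and the assumption that those three parameters never change the content is taken from the thread rather than verified against the site. In a Magento store the tag would normally come from the theme or configuration, not from a script like this.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters only change sorting and layout, not content.
PRESENTATION_PARAMS = {"dir", "mode", "order"}


def canonical_url(url):
    """Return the URL with presentation-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_PARAMS
    ]
    path = parts.path or "/"
    # Rebuild the URL without the dropped parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the <link> element that would go in the page's <head>."""
    return '<link rel="canonical" href="%s" />' % escape(canonical_url(url), quote=True)


if __name__ == "__main__":
    print(canonical_tag("http://www.unitedbmwonline.com/?dir=asc&mode=grid&order=name"))
```

Every URL in the original list reduces to the same homepage address under this rule, which is the behaviour the canonical tag is meant to signal to crawlers.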
It also sounds like you've addressed the issue in Google's Webmaster Tools. Unfortunately, Google doesn't let SEOmoz sync with Webmaster Tools, so anything you set there won't show up in the Web App.
Finally, don't forget about Bing Webmaster. They have similar parameter settings you can submit.
By the way, some SEOs would suggest putting meta robots "NOINDEX, FOLLOW" tags on those duplicate pages. While this may potentially send conflicting signals when coupled with the canonical tag, it is a potentially valid approach.
Hope this helps! Best of luck with your SEO.
-
This is exactly my current situation...
As a result of the SEOmoz duplicate content report, I set about resolving these issues...
In the first instance I configured URL parameters via Google Webmaster Tools. It immediately occurred to me that whilst this fixes the potential duplicate content in Google, the configuration does not affect other search engines, and the work is unlikely to be reflected in future SEOmoz crawls of the site.
I'm interested in creating an overarching method of removing the potential duplication caused by the URL parameters required to paginate, sort and filter content. The majority of these URL parameters are standardized across web applications. But is it actually required?
In my case each Magento store uses the canonical tag correctly and has an updated robots.txt to restrict the crawling of areas of the site that should be excluded... In a sense this is the overarching method of removing potential duplicate content. So why is SEOmoz reporting duplicate content?
I suppose the big question is: is SEOmoz crawling the site correctly, and do these results reflect robots.txt and canonical tags?
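One way to sanity-check the robots.txt half of that question locally is to run the rules through Python's standard `urllib.robotparser`. The sketch below is only an illustration: the robots.txt content and the example.com URLs are made up, and it assumes the SEOmoz crawler identifies itself as "rogerbot". It shows how a standards-style parser reads the rules, not how any particular crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the real robots.txt of any
# site mentioned in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /catalogsearch/
"""


def is_crawlable(robots_txt, user_agent, url):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/checkout/cart/",
        "http://www.example.com/?dir=asc&order=name",
    ):
        # "rogerbot" is an assumed user-agent name for the SEOmoz crawler.
        print(url, is_crawlable(ROBOTS_TXT, "rogerbot", url))
```

If a parameterised URL comes back as crawlable, robots.txt is not excluding it, and any duplicate report for that URL would then depend on how the crawler treats the canonical tag.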
-
Thank you for your thoughts.
As mentioned in my response above, canonical tags have already been configured for the site; it's just this home page that remains the issue.
-
Thanks for your response.
I looked in URL Parameters and see that dir and mode are already defined.
Then I searched the http://www.unitedbmwonline.com page source for canonical links and none are defined, though I do have canonical tags set up for the rest of the site.
Any other thoughts on how to remove these duplicates?
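The manual check described above, searching a page's source for canonical links, can also be scripted. This is a minimal sketch using only Python's standard `html.parser`; the class name `CanonicalFinder`, the helper `find_canonicals`, and the sample HTML are all hypothetical, and fetching the page is left out so the example stays self-contained.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    """Return the list of canonical URLs declared in the given HTML source."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


if __name__ == "__main__":
    # Made-up page source for illustration.
    sample = '<html><head><link rel="canonical" href="http://www.example.com/" /></head></html>'
    print(find_canonicals(sample))
```

An empty list for the homepage source would match what was found by hand: no canonical link is declared there, even though the rest of the site has them.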
-
You can also tell Google to ignore certain query string variables through Webmaster Tools.
For instance, indicate that "dir" and "mode" have no impact on content.
Other search engines have similar controls.
-
This is why the canonical tag was invented: to solve duplicate content issues when URL parameters are involved. Set a canonical tag on all these pages pointing to the version of the page you want to appear in search results. As long as the pages are identical, or close to it, the search engines will most likely respect the canonical tag and pass the duplicate versions' link juice along to the page you're pointing to.
Here's some info: http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html. If you Google "canonical tag", you'll find lots more!