Why does Crawl Diagnostics report this as duplicate content?
-
Hi guys,
we've been addressing a duplicate content problem on our site over the past few weeks. We've gradually implemented rel canonical tags in various parts of our ecommerce store and have been observing the effects by tracking changes in both SEOmoz and Webmaster Tools.
Although our duplicate content errors are definitely decreasing, I can't help but wonder why some URLs are still being flagged with duplicate content by our SEOmoz crawler.
Here's an example, taken directly from our Crawl Diagnostics Report:
URL with 4 Duplicate Content errors:
/safety-lights.html
Duplicate content URLs:
/safety-lights.html?cat=78&price=-100
/safety-lights.html?cat=78&dir=desc&order=position
/safety-lights.html?cat=78
/safety-lights.html?manufacturer=514
What I don't understand is that all of the URLs with URL parameters have a rel canonical tag pointing to the 'real' URL:
/safety-lights.html
So why is the SEOmoz crawler still flagging this as duplicate content?
-
So glad I could help get this figured out! Sometimes it just takes another set of eyes.
-Chiaryn
-
Good catch Chiaryn! Totally didn't see this.
Essentially, two URLs end up displaying the same content: one is the URL that Google picks up from our XML sitemap, and the other is a dynamic URL with filtering parameters, built on a category URL one level higher.
The canonical tags were set up so that each points to its own base category, and in this case those base categories are different even though the content is the same.
We will address this.
Thanks!
-
Hi there,
I looked into your campaign and it seems that this is happening because of where your canonical tags are pointing. These pages are considered duplicates because their canonical tags point to different URLs. For example, accessories/lights.html?cat=78&price=-100 is considered a duplicate of accessories/lights/safety-lights.html?manufacturer=514 because the canonical tag for the first page is accessories/lights.html while the canonical for the second URL is accessories/lights/safety-lights.html.
Since the canonical tags point to different pages it is assumed that accessories/lights.html and accessories/lights/safety-lights.html are likely to be duplicates themselves.
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
- If A references B as the canonical, then they are not considered duplicates
- If A and B both reference C as canonical, A and B are not considered duplicates of each other
- If A references C as canonical, A and B are considered duplicates
- If A references C as canonical, B references D, then A and B are considered duplicates
The examples you've provided actually fall under the fourth case I've listed above.
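As a rough illustration of the four cases above, here is a minimal Python sketch. It is not the crawler's actual code: the function names are hypothetical, and it assumes that a page without a canonical tag is treated as its own canonical, which is one way to read the four rules as a single check (two duplicate pages are flagged unless they resolve to the same canonical URL).

```python
def canonical_target(url, canonicals):
    """Return the URL a page declares as canonical.

    Assumption for this sketch: a page with no canonical tag
    is treated as pointing at itself.
    """
    return canonicals.get(url, url)


def flagged_as_duplicates(a, b, canonicals):
    """Given two pages with duplicate content, report whether they
    would still be flagged: only when their canonical targets differ."""
    return canonical_target(a, canonicals) != canonical_target(b, canonicals)


# Case 1: A references B as canonical -> not flagged
case_1 = flagged_as_duplicates("A", "B", {"A": "B"})
# Case 2: A and B both reference C -> not flagged
case_2 = flagged_as_duplicates("A", "B", {"A": "C", "B": "C"})
# Case 3: A references C, B declares nothing -> flagged
case_3 = flagged_as_duplicates("A", "B", {"A": "C"})
# Case 4: A references C, B references D -> flagged
case_4 = flagged_as_duplicates("A", "B", {"A": "C", "B": "D"})

# The two URLs from this thread, with the canonicals described above
thread_canonicals = {
    "accessories/lights.html?cat=78&price=-100": "accessories/lights.html",
    "accessories/lights/safety-lights.html?manufacturer=514":
        "accessories/lights/safety-lights.html",
}
thread_case = flagged_as_duplicates(
    "accessories/lights.html?cat=78&price=-100",
    "accessories/lights/safety-lights.html?manufacturer=514",
    thread_canonicals,
)

print(case_1, case_2, case_3, case_4, thread_case)
```

Under these assumptions, the two URLs from the thread come out as flagged because their canonical tags resolve to different pages, which matches the fourth case.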
I hope this clears things up. Please let me know if you have any other questions.
-Chiaryn
-
Does seem a little odd. Could you post the domain so we can have a more detailed look?
Thanks
Iain - Reload Media