Why does Crawl Diagnostics report this as duplicate content?
-
Hi guys,
We've been addressing a duplicate content problem on our site over the past few weeks. We've gradually implemented rel canonical tags in various parts of our ecommerce store and have been observing the effects by tracking changes in both SEOmoz and Webmaster Tools.
Although our duplicate content errors are definitely decreasing, I can't help but wonder why some URLs are still being flagged with duplicate content by our SEOmoz crawler.
Here's an example, taken directly from our Crawl Diagnostics Report:
URL with 4 Duplicate Content errors:
/safety-lights.html
Duplicate content URLs:
/safety-lights.html?cat=78&price=-100
/safety-lights.html?cat=78&dir=desc&order=position
/safety-lights.html?cat=78
/safety-lights.html?manufacturer=514
What I don't understand is that all of the URLs with URL parameters have a rel canonical tag pointing to the 'real' URL:
/safety-lights.html
So why is the SEOmoz crawler still flagging this as duplicate content?
-
So glad I could help get this figured out! Sometimes it just takes another set of eyes.
-Chiaryn
-
Good catch Chiaryn! Totally didn't see this.
Essentially, two URLs end up displaying the same content: one is the URL that Google picks up from our XML sitemap, and the other is a dynamic URL with filtering parameters based on the category URL one level higher.
The canonical tags were set up to point to the base category, and in this case the two base categories are different even though the content is the same.
We will address this.
Thanks!
-
Hi there,
I looked into your campaign and it seems that this is happening because of where your canonical tags are pointing. These pages are considered duplicates because their canonical tags point to different URLs. For example, accessories/lights.html?cat=78&price=-100 is considered a duplicate of accessories/lights/safety-lights.html?manufacturer=514 because the canonical tag for the first page is accessories/lights.html while the canonical for the second URL is accessories/lights/safety-lights.html.
Since the canonical tags point to different pages, it is assumed that accessories/lights.html and accessories/lights/safety-lights.html are likely to be duplicates themselves.
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
- If A references B as the canonical, then they are not considered duplicates
- If A and B both reference C as canonical, A and B are not considered duplicates of each other
- If A references C as canonical (and B does not), A and B are considered duplicates
- If A references C as canonical, B references D, then A and B are considered duplicates
The examples you've provided actually fall into the fourth example I've listed above.
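To make the four cases easier to check against your own URLs, here is a minimal Python sketch of how I read them. It is not the crawler's actual code: the function names `canonical_target` and `flagged_as_duplicates` are hypothetical, and it assumes that a page with no canonical tag simply stands for itself.

```python
from typing import Optional


def canonical_target(url: str, canonical: Optional[str]) -> str:
    """Return the URL a page effectively stands for.

    Assumption: a page with no canonical tag stands for itself.
    """
    return canonical or url


def flagged_as_duplicates(
    url_a: str,
    canonical_a: Optional[str],
    url_b: str,
    canonical_b: Optional[str],
) -> bool:
    """Hypothetical helper: True if two pages that share the same content
    would still be reported as duplicates under the four cases above."""
    return canonical_target(url_a, canonical_a) != canonical_target(url_b, canonical_b)


# Case 1: A references B as canonical, so they are not duplicates.
print(flagged_as_duplicates("A", "B", "B", None))  # False

# Case 2: A and B both reference C, so they are not duplicates.
print(flagged_as_duplicates("A", "C", "B", "C"))  # False

# Case 3: only A references C, so A and B are duplicates.
print(flagged_as_duplicates("A", "C", "B", None))  # True

# Case 4: the two URLs from this thread point at different canonicals.
print(
    flagged_as_duplicates(
        "accessories/lights.html?cat=78&price=-100",
        "accessories/lights.html",
        "accessories/lights/safety-lights.html?manufacturer=514",
        "accessories/lights/safety-lights.html",
    )
)  # True
```

Under this reading, two pages with the same content stop being flagged only when they both resolve to the same canonical URL.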
I hope this clears things up. Please let me know if you have any other questions.
-Chiaryn
-
Does seem a little odd. Could you post the domain so we can have a more detailed look?
Thanks
Iain - Reload Media