Crawl Diagnostics bringing 20k+ errors as duplicate content due to session ids
-
Signed up to the trial version of Seomoz today just to check it out as I have decided I'm going to do my own SEO rather than outsource it (been let down a few times!). So far I like the look of things and have a feeling I am going to learn a lot and get results.
However I have just stumbled on something. After Seomoz dones it's crawl diagnostics run on the site (www.deviltronics.com) it is showing 20,000+ plus errors. From what I can see almost 99% of this is being picked up as erros for duplicate content due to session id's, so i am not sure what to do!
I have done a "site:www.deviltronics.com" on google and this certainly doesn't pick up the session id's/duplicate content. So could this just be an issue with the Seomoz bot. If so how can I get Seomoz to ignore these on the crawl?
Can I get my developer to add some code somewhere.
Help will be much appreciated. Asif
-
Hello Tom and Asif,
First of all Tom thanks for the excellent blog post re google docs.
We are also using the Jshop platform for one of our sites. And am not sure whether it is working correctly in terms of SEO. I just ran an seomoz crawl of the site and found that every single link in the list has a rel canonical in it, even the ones with session id's.
Here is an example:
www.strictlybeautiful.com/section.php/184/1/davines_shampoo/d112a41df89190c3a211ec14fdd705e9
www.strictlybeautiful.com/section.php/184/1/davines_shampoo
As Asif has pointed out the Jshop people say they have programmed it so that google cannot pick up the session ids, firstly is that even possible? And if I assume thats not an issue then what about the fact that every single page on the site has a rel canonical link on it?
Any help would be much appreciated.
<colgroup><col width="1074"></colgroup>
| |
| | -
Asif, here's the page with the information on the SEOmoz bot.
-
Thanks for the reply Tom. Spoke to our developer he has told me that the website platform (Jshop) does not show session ID's to the search engines so we are ok on that side. However as it doesn't recognise the Seomoz bot it shows it the session ID's. Do you know where I can find info on the Seomoz bot so we can see what it identifies itself as so it can be added to the list of recognised spiders?
Thanks
-
Hi Asif!
Firstly - I'd suggest that as soon as possible you address the core problem - the use of session ids in the URL. There are not many upsides to the approach and there are many downsides.That it doesn't show up with the site: command doesn't mean it isn't having a negative impact.
In the meantime, you should add a rel=canonical tag to all the offending pages pointing to the URL without the session id. Secondly, you could use robots.txt to block the SEOmoz bot from crawling pages with session ids, but it may affect the bots ability to crawl the site if all the links it is presented with are with session ids - which takes us back around to fixing the core problem.
Hope this helps a little!
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content/Missing Meta Description | Pages DO NOT EXISIT!
Hello all, For the last few months, Moz has been showing us that our site has roughly 2,000 duplicate content errors. Pages that were actually duplicate content, I took care of accordingly using best practice (301 redirects, canonicalization,etc.). Still remaining after these fixes were errors showing for pages that we have never created. Our homepage is www.primepay.com. An example of pages that are being shown as duplicate content is http://primepay.com/blog/%5BLink%20to%20-%20http:/www.primepay.com/en/payrollservices/payroll/payroll/payroll/online-payroll with a referring page of http://primepay.com/blog/%5BLink%20to%20-%20http:/www.primepay.com/en/payrollservices/payroll/payroll/online-payroll. Some of these are even now showing up as 403 and 404 errors. The only real page on our site within that URL strand is primepay.com/payroll or primepay.com/payroll/online-payroll. Therefore, I am not sure where Moz is getting these pages from. Another issue we are having in relation to duplicate content is that moz is showing old campaign url’s tacked on to our blog page i.e. http://primepay.com/blog?title=&page=2&utm_source=blog&utm_medium=blogCTA&utm_campaign=IRSblogpost&qt-blog_tabs=1. As of this morning, our duplicate content went from 2,000 to 18,000. I exported all of our crawl diagnostics data and looked to see what the referring pages were, and even they are not pages that we have created. When you click on these links, they take you to a random point in time from the homepage of our blog; some dating back to 2010. I checked our crawl stats in both Google and Bing’s Webmaster tool, and there are no duplicate content or 400 level errors being reporting from their crawl. My team is truly at a loss with trying to resolve this issue and any help with this matter would be greatly appreciated.
Moz Pro | | PrimePay0 -
Tags on my website cause duplicate content
Hi I just recently started a website and I am new to MOZ pro. What Moz pro detected on my website under high priority is that "duplicate page content" and what I realize about these duplicate page content is regarding the tags i put on my post. Because it is a wordpress blog, we are allow to add tags on the side before we publish our post. And because of these tags, it linked to the same page but different url. for example website.com/tags/whatever website.com/tags/whatever 2 and both these url direct to the same page So how do i solve this? do i just stop tagging whenever i write a post? delete all tags while it is not necessary? i seen method like 301 redirect or rel=canonical but is there anyway to solve this problem so I do not face this issue whenever i make a new post in my blog? I mean it doesnt make sense to redirect 301 to every single tags i have whenever i write a new post right? thanks guys
Moz Pro | | andzon0 -
Ajax4SEO and rogerbot crawling
Has anyone had any experience with seo4ajax.com and moz? The idea is that it points a bot to a html version of an ajax page (sounds good) without the need for ugly urls. However, I don't know how this will work with rogerbot and whether moz can crawl this. There's a section to add in specific user agents and I've added "rogerbot". Does anyone know if this will work or not? Otherwise, it's going to create some complications. I can't currently check as the site is in development and the dev version is noindexed currently. Thanks!
Moz Pro | | LeahHutcheon0 -
Crawl Report Warnings
How much notice should be paid to the warnings on the SEO Moz crawl reports? We manage a fairly large property site and a lot of the errors on the crawl reports relate to automated responses. As a matter of priority which of the list below will have negative affects with the search engines? Temporary RedirectToo Many On-Page LinksOverly-Dynamic URLTitle Element Too Long (> 70 Characters)Title Missing or EmptyDuplicate Page ContentDuplicate Page TitleMissing Meta Description Tag
Moz Pro | | SoundinTheory0 -
Sub-domain not crawled
One of our sites was recently re-designed. The home page is a landing page (www.labadieauto.com) and I moved the blog to this domain (labadieauto.com/blog/) and put a link is the bottom left of the home page. Since the change the SEOMOZ campaign overview is showing only 1 page crawled. This is not setup as a sub-domain so why isn't it showing in the crawl? Help!
Moz Pro | | LabadieAuto0 -
Errors went from 2420 to ZERO
Of course this happened without my intervention, i don't know why but seomoz is reporting 0 errors.
Moz Pro | | iFix0 -
How long should the weekly crawl take
Mine started yesterday afternoon and it's now almost 11pm on Sunday. 30+ hours and still not finished (and no progress indicator). 438 pages quoted as being crawled. That's not normal - right? I have made a bunch of changes based on last weeks crawl so I have been eagerly waiting for this to finish But 30 hours?.... Thanks. Mark
Moz Pro | | MarkWill0 -
What the . . ! Duplicate Pages and Titles WAY up?
My duplicate pages went up 50 plus in the past week, and my duplicate page titles went over more then 100. We recently redesigned the website, but it has been up for several weeks now. The only change I made specifically last week or late the week before was to get my 301 redirects done to get the www. version and the non www version pointing to the same place (as well as a couple other sites that point to it). I'm sure this is not enough info to figure out what went wrong . . . I'd love some help in figuring this out though.
Moz Pro | | damon12120