Crawl Diagnostics reporting 20k+ duplicate content errors due to session IDs
-
Signed up to the trial version of SEOmoz today just to check it out, as I have decided I'm going to do my own SEO rather than outsource it (been let down a few times!). So far I like the look of things and have a feeling I am going to learn a lot and get results.
However, I have just stumbled on something. After SEOmoz ran its crawl diagnostics on the site (www.deviltronics.com), it is showing 20,000+ errors. From what I can see, almost 99% of these are being picked up as duplicate content errors caused by session IDs, so I am not sure what to do!
I have done a "site:www.deviltronics.com" search on Google and it certainly doesn't pick up the session IDs or duplicate content. So could this just be an issue with the SEOmoz bot? If so, how can I get SEOmoz to ignore these on the crawl?
Can I get my developer to add some code somewhere?
Help will be much appreciated. Asif
-
Hello Tom and Asif,
First of all, Tom, thanks for the excellent blog post re Google Docs.
We are also using the Jshop platform for one of our sites, and I am not sure whether it is working correctly in terms of SEO. I just ran an SEOmoz crawl of the site and found that every single link in the list has a rel canonical in it, even the ones with session IDs.
Here is an example:
www.strictlybeautiful.com/section.php/184/1/davines_shampoo/d112a41df89190c3a211ec14fdd705e9
www.strictlybeautiful.com/section.php/184/1/davines_shampoo
As Asif has pointed out, the Jshop people say they have programmed it so that Google cannot pick up the session IDs. Firstly, is that even possible? And if I assume that's not an issue, then what about the fact that every single page on the site has a rel canonical link on it?
Any help would be much appreciated.
-
Asif, here's the page with the information on the SEOmoz bot.
-
Thanks for the reply, Tom. I spoke to our developer and he has told me that the website platform (Jshop) does not show session IDs to the search engines, so we are OK on that side. However, as it doesn't recognise the SEOmoz bot, it shows it the session IDs. Do you know where I can find info on the SEOmoz bot, so we can see what it identifies itself as and add it to the list of recognised spiders?
Thanks
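For anyone wondering what a "recognised spiders" check tends to look like, here is a minimal sketch in Python. It is not Jshop's actual code: `is_known_spider` and the token list are hypothetical, and `rogerbot` is assumed to be the name the SEOmoz crawler sends in its user agent, so confirm that against SEOmoz's own bot documentation before relying on it.

```python
# Assumption: crawlers are recognised by a substring of their User-Agent header.
# "rogerbot" is assumed to be the SEOmoz crawler's token; verify before use.
KNOWN_SPIDER_TOKENS = ("googlebot", "bingbot", "slurp", "rogerbot")


def is_known_spider(user_agent: str) -> bool:
    """Return True if the User-Agent contains a known crawler token."""
    ua = (user_agent or "").lower()  # tolerate a missing header
    return any(token in ua for token in KNOWN_SPIDER_TOKENS)


if __name__ == "__main__":
    print(is_known_spider("rogerbot/1.0"))
    print(is_known_spider("Mozilla/5.0 (Windows NT 10.0) Firefox/100.0"))
```

A check like this only hides the session IDs from the crawlers it knows about; any bot missing from the list still sees them in every link.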
-
Hi Asif!
Firstly, I'd suggest that you address the core problem as soon as possible: the use of session IDs in the URL. There are not many upsides to the approach and there are many downsides. That it doesn't show up with the site: command doesn't mean it isn't having a negative impact.
In the meantime, you should add a rel=canonical tag to all the offending pages pointing to the URL without the session ID. Secondly, you could use robots.txt to block the SEOmoz bot from crawling pages with session IDs, but that may affect the bot's ability to crawl the site if all the links it is presented with contain session IDs, which takes us back around to fixing the core problem.
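As a rough illustration of the rel=canonical suggestion, the Python sketch below strips the session ID from a URL and builds the tag. It assumes the session ID is always the last path segment and is 32 hex characters long, as in the strictlybeautiful.com example quoted in this thread; the function names and the default `http` scheme are hypothetical, and your developer would need to adapt the idea to the platform's own templates.

```python
import re
from html import escape

# Assumption: the session ID is a trailing path segment of 32 hex characters,
# matching the example URL quoted in this thread.
SESSION_SEGMENT = re.compile(r"/[0-9a-f]{32}/?$")


def strip_session_id(url: str) -> str:
    """Return the URL without a trailing 32-character hex session segment."""
    return SESSION_SEGMENT.sub("", url)


def canonical_tag(url: str, scheme: str = "http") -> str:
    """Build a rel=canonical link element pointing at the session-free URL."""
    clean = strip_session_id(url)
    if "://" not in clean:
        # Canonical URLs should be absolute, so add a scheme if it is missing.
        clean = f"{scheme}://{clean}"
    return f'<link rel="canonical" href="{escape(clean, quote=True)}" />'


if __name__ == "__main__":
    print(canonical_tag(
        "www.strictlybeautiful.com/section.php/184/1/davines_shampoo/"
        "d112a41df89190c3a211ec14fdd705e9"
    ))
```

The tag belongs in the `<head>` of every variant of the page, so the session and non-session URLs both point to the same clean address.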
Hope this helps a little!