Why doesn't Moz crawler follow robots.txt?
-
It is crawling the entire site, including content we do not want it to crawl. Please advise.
-
Which I am ok with, but why am I getting duplicate content?
-
Yes, it doesn't tell them which pages not to crawl - just not to index them
-
It has been used correctly. The site is a Magento site and it has rel=canonical built in. There are a lot of filters for products, so it uses rel=canonical to tell Google which version to index.
-
rel=canonical is not really a robots instruction - it is there to help with duplicate copy, where you have the same or similar pages and you're telling search engines which page is the preferred one.
If you don't want pages crawled, you have to tell search engines in the robots.txt file.
-
Hi There,
Rel=canonical tags tell robots which page, out of many, should actually be indexed.
For SEOs, canonicalization refers to individual web pages that can be loaded from multiple URLs. This is a problem because when multiple pages have the same content but different URLs, links that are intended to go to the same page get split up among multiple URLs. This means that the popularity of the pages gets split up. Unfortunately for web developers, this happens far too often because the default settings for web servers create this problem.
https://mza.seotoolninja.com/learn/seo/canonicalization
I feel you have not used it correctly. Check the above article and see if it helps.
Thanks,
Vijay
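
One way to check what a crawler actually sees is to pull the canonical tag out of a page's HTML. Below is a minimal sketch using only Python's standard library; the sample page and the example.com URL are made up for illustration, and a real check would fetch the HTML of each filtered product URL first.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        rel_values = attr_map.get("rel", "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return the list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical filtered product page pointing at its unfiltered category URL.
sample_page = """
<html><head>
<link rel="canonical" href="https://www.example.com/shoes">
</head><body>Shoes, filtered by colour</body></html>
"""

canonicals = find_canonicals(sample_page)
print(canonicals)
```

If a page returns no canonical URL, or more than one, that is worth looking at before assuming the crawler is ignoring the tag.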
-
So I made a mistake: it isn't the robots.txt that is the issue. I am getting hit with a ton of duplicate content penalties, so I figured that was it. The problem is that I have pages with rel=canonical tags that it is ignoring. Does Roger not read those?
-
Hi
Have to agree with the above: Rogerbot does listen to the robots.txt file, unlike Bing - while they are getting better, Bing ignores the robots.txt file frequently.
I've analysed quite a few server logs over the years and Roger has always listened to the file - it's usually a mistake in the robots.txt file.
There is an option to test your robots.txt file in Google Search Console (GSC) - while this tests whether Google will crawl the page, Roger usually has the same instructions as Google.
However, if you are still pretty certain that Roger is ignoring robots.txt, please DM me your server logs and your website URL and I will take a look and analyse them for you (free, of course).
Thanks
Andy
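
For anyone who wants to do a first pass over their own logs, here is a rough sketch of the idea in Python. It assumes the combined access log format, and the sample lines, IP addresses and user-agent strings are invented placeholders. The prefix check is only an approximation of how robots.txt rules are matched, so treat any hits as leads to investigate rather than proof.

```python
import re

# Matches the request, status, size, referrer and user-agent fields of a
# combined-format access log line (an assumed format; adjust for your server).
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def rogerbot_hits_on_blocked_paths(log_lines, blocked_prefixes):
    """Return paths requested by rogerbot that start with a blocked prefix."""
    hits = []
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # Line is not in the expected format.
        if "rogerbot" not in match.group("agent").lower():
            continue  # Request came from some other client.
        path = match.group("path")
        if any(path.startswith(prefix) for prefix in blocked_prefixes):
            hits.append(path)
    return hits


# Invented sample lines; the user-agent strings are placeholders.
sample_log = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET /tmp/cache.html HTTP/1.1" 200 512 "-" "rogerbot/1.0"',
    '203.0.113.5 - - [01/Jan/2024:10:00:05 +0000] "GET /products/shoes HTTP/1.1" 200 2048 "-" "rogerbot/1.0"',
    '198.51.100.7 - - [01/Jan/2024:10:00:09 +0000] "GET /tmp/other.html HTTP/1.1" 200 256 "-" "SomeOtherBot/2.0"',
]
blocked = ["/cgi-bin/", "/tmp/", "/junk/"]

hits = rogerbot_hits_on_blocked_paths(sample_log, blocked)
print(hits)
```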
-
All major search engines, as well as Moz's crawler Rogerbot and the Internet Archive, respect robots.txt, the standard "robots exclusion protocol" for communicating with web crawlers and other web robots.
If you wish to exclude specific content from all search engines, you can use the following sample code as a reference for blocking specific directories:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
However, if you want to block only Moz's Rogerbot from crawling specific sections of your website, you can use the following reference code to block specific directories from rogerbot:
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
I hope this helps. If you have specific questions, please feel free to respond; I will be happy to answer them.
Regards,
Vijay
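
To sanity-check rules like the ones above before deploying them, Python's standard `urllib.robotparser` module can parse a robots.txt body and answer allow/deny questions. This is only a sketch: the paths being tested are made up, and Python's matching logic is not guaranteed to be identical to how rogerbot or any other crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The rogerbot-only rules from the post above.
ROBOTS_TXT = """\
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if the parsed rules allow user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical paths, chosen only to exercise the rules.
blocked_for_roger = is_allowed(ROBOTS_TXT, "rogerbot", "/tmp/page.html")
open_for_roger = is_allowed(ROBOTS_TXT, "rogerbot", "/products/shoes")
open_for_others = is_allowed(ROBOTS_TXT, "SomeOtherBot", "/tmp/page.html")

print(blocked_for_roger, open_for_roger, open_for_others)
```

Because the file has no `User-agent: *` group, crawlers other than rogerbot are not restricted by it, which is what the last check illustrates.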
-
Hi there! Moz's crawler, rogerbot, does follow robots.txt. When he's not following robots.txt, it's usually because the robots.txt file is formatted improperly. Learn more about formatting your file here: https://mza.seotoolninja.com/learn/seo/robotstxt
For more information on Roger, including how to block him, head here: https://mza.seotoolninja.com/help/guides/moz-procedures/what-is-rogerbot
And if you want to test your formatting, try the Robots Checker here: https://support.google.com/webmasters/answer/6062598
If you're still unable to determine why rogerbot is crawling your site, feel free to write in to [email protected]!