Why doesn't Moz's crawler follow robots.txt?
-
It is crawling the entire site, and there is content we do not want it to crawl. Please advise.
-
Which I am ok with, but why am I getting duplicate content?
-
Yes, it doesn't tell them which pages not to crawl - just not to index them
-
It has been used correctly. The site is a Magento site and they have it built in. There are a lot of filters for products so it uses rel=canonical to tell Google which to index.
-
rel=canonical is not really a robots instruction. It is there to help with duplicate content: where you have the same or similar pages, you're telling search engines which page is the preferred one.
If you don't want pages crawled, you have to tell search engines in the robots.txt file.
-
Hi There,
Rel=canonical tags tell robots which page, out of many, is the one to index.
For SEOs, canonicalization refers to individual web pages that can be loaded from multiple URLs. This is a problem because when multiple pages have the same content but different URLs, links that are intended to go to the same page get split up among multiple URLs. This means that the popularity of the pages gets split up. Unfortunately for web developers, this happens far too often because the default settings for web servers create this problem.
https://mza.seotoolninja.com/learn/seo/canonicalization
I suspect you have not used it correctly. Check the article above and see if it helps.
Thanks,
Vijay
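To check what a given page actually declares, here is a minimal sketch that pulls the rel=canonical URL out of a page's HTML using only the Python standard library. The `CanonicalFinder` class, the `find_canonical` helper, and the example.com URLs are all made up for illustration; it only shows what the tag says, not how Rogerbot or any search engine acts on it.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, in any letter case.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical filtered product listing that points back at the main category.
SAMPLE_HTML = """
<html>
  <head>
    <title>Shoes, filtered by colour</title>
    <link rel="canonical" href="https://example.com/shoes" />
  </head>
  <body></body>
</html>
"""

if __name__ == "__main__":
    print(find_canonical(SAMPLE_HTML))
```

If a filtered URL and the main category page both return the same canonical, the tag is at least set up consistently. If it returns None on the filtered pages, the tag is missing there.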
-
So I made a mistake: it isn't the robots.txt that is the issue. I am getting hit with a ton of duplicate content penalties, so I figured that was it. The problem is that I have pages with rel=canonical tags that it is ignoring. Does Roger not read those?
-
Hi
I have to agree with the above: Rogerbot does obey the robots.txt file. Bing is different; while it is getting better, Bing frequently ignores robots.txt.
I've analysed quite a few server logs over the years and Roger has always obeyed the file. The cause is usually a mistake in the robots.txt file.
There is an option to test your robots.txt file in Google Search Console. That test only shows whether Google will crawl the page, but Roger usually follows the same instructions as Google.
However, if you are still fairly certain that Roger is ignoring robots.txt, please DM me your server logs and your website and I will take a look and analyse them for you (free, of course).
Thanks
Andy
-
All major crawlers, including Moz's crawler Rogerbot and the Internet Archive's, respect robots.txt, the standard "robots exclusion protocol" file used to communicate with web crawlers and other web robots.
If you want to keep all search engines out of specific directories, you can use the following sample as a reference:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
However, if you want to block only Moz's Rogerbot from crawling specific sections of your website, you can use the following reference code:
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
I hope this helps. If you have specific questions, please feel free to respond; I will be happy to answer them.
Regards,
Vijay
-
Hi there! Moz's crawler, rogerbot, does follow robots.txt. When he's not following robots.txt, it's usually because the robots.txt file is formatted improperly. Learn more about formatting the file here: https://mza.seotoolninja.com/learn/seo/robotstxt
For more information on Roger, including how to block him, head here: https://mza.seotoolninja.com/help/guides/moz-procedures/what-is-rogerbot
And if you want to test your formatting, try the Robots Checker here: https://support.google.com/webmasters/answer/6062598
If you're still unable to determine why rogerbot is crawling your site, feel free to write in to [email protected]!
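For a quick local check before writing in, the sketch below feeds robots.txt rules to Python's standard `urllib.robotparser` and asks whether a given user agent may fetch a URL. The rules are the Rogerbot sample from earlier in this thread; the `is_allowed` helper and the example.com URLs are hypothetical. This shows how Python's parser reads the file, which is an approximation and not a guarantee of how rogerbot itself interprets it.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the Rogerbot sample earlier in the thread.
ROGERBOT_RULES = """\
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network call.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, used only to exercise the rules above.
    print(is_allowed(ROGERBOT_RULES, "rogerbot", "https://example.com/junk/page.html"))   # False
    print(is_allowed(ROGERBOT_RULES, "rogerbot", "https://example.com/products/"))        # True
    print(is_allowed(ROGERBOT_RULES, "Googlebot", "https://example.com/junk/page.html"))  # True
```

The last call comes back True because the file has no `User-agent: *` group, so crawlers other than Rogerbot are not restricted at all. Paste in your own robots.txt content and the URLs rogerbot is crawling to see whether the rules say what you think they say.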