Unsolved Site Crawler not working but on-demand crawler working
-
Hi,
In Moz Pro, when we use Site Crawler (or Recrawl), we see a message saying the site is banned. But when we use the on-demand crawler, it generates the report successfully.
I would just like to know whether rogerbot is used in both of these cases.
Please note that Site Crawler was working perfectly before, so the required setup has been in place for a long time. The Site Crawler ban issue started appearing in Nov/Dec 2023.
Could you please help us understand how we can make Site Crawler work?
I am happy to provide more details if you need any.
Thanks
-
Hi,
This question requires help from Moz Pro.
When we investigated the logs in our system, we found that Site Crawler is not working because its requests are missing the 'user-agent' request header, and it got banned for this reason.
On-demand crawler is still working because its requests include the 'user-agent' request header, so our system approved them and the report could be generated.
Could you please look into this issue of the missing user-agent request header?
Your response is much appreciated.
Thanks
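For anyone who wants to reproduce this kind of log check, here is a minimal sketch. It assumes the access log is in the common "combined" format, where a missing User-Agent is written as "-". The function name, the sample log lines, the IP addresses, and the "example-bot/1.0" agent string are all made up for illustration; they are not taken from our real logs or from Moz.

```python
import re

# Assumed layout (combined log format):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def requests_without_user_agent(lines):
    """Return (ip, time, request) for entries whose User-Agent is empty or '-'."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed format
        if match.group("agent").strip() in ("", "-"):
            hits.append(
                (match.group("ip"), match.group("time"), match.group("request"))
            )
    return hits


# Hypothetical sample entries (documentation IP ranges, invented agent string).
SAMPLE_LOG = [
    '203.0.113.7 - - [05/Dec/2023:10:00:00 +0000] "GET / HTTP/1.1" 403 153 "-" "-"',
    '203.0.113.8 - - [05/Dec/2023:10:00:05 +0000] "GET / HTTP/1.1" 200 5120 "-" "example-bot/1.0"',
]

if __name__ == "__main__":
    for ip, time, request in requests_without_user_agent(SAMPLE_LOG):
        print(f"{time}  {ip}  {request}  (no User-Agent)")
```

If the server uses a different log format, the regular expression would need to be adjusted before the results can be trusted.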
-
Hi,
I will double-check the firewall settings on our servers. Could you please share the IP addresses/ranges used by rogerbot for the Moz Pro Site Crawler? We will verify them against our firewall rules.
Thanks
Shashi -
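Once the crawler IP addresses are known, the firewall comparison could be done along these lines. This is only a sketch using Python's standard `ipaddress` module. The function name is hypothetical, and every address and range below is a placeholder from the reserved documentation ranges, not a real rogerbot IP or a real firewall rule.

```python
import ipaddress


def find_blocking_rules(crawler_ips, blocked_networks):
    """Map each crawler IP to the blocked CIDR ranges that contain it."""
    networks = [ipaddress.ip_network(net) for net in blocked_networks]
    blocked = {}
    for ip_text in crawler_ips:
        ip = ipaddress.ip_address(ip_text)
        matches = [str(net) for net in networks if ip in net]
        if matches:
            blocked[ip_text] = matches
    return blocked


# Placeholder values only; replace with the IPs shared by Moz support
# and the deny rules exported from the firewall.
CRAWLER_IPS = ["198.51.100.23", "192.0.2.10"]
BLOCKED_NETWORKS = ["198.51.100.0/24", "203.0.113.0/24"]

if __name__ == "__main__":
    for ip, rules in find_blocking_rules(CRAWLER_IPS, BLOCKED_NETWORKS).items():
        print(f"{ip} is covered by blocked range(s): {', '.join(rules)}")
```

An empty result would suggest that IP blocking is not the cause, at least for the rules that were exported.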
I am looking for the rogerbot Site Crawler IP addresses. Please provide them.
Thanks
-
@Aditi_08
Could you please help me find out how to get the IP addresses of Site Crawler? Please note that Site Crawler was working before November, so the IP addresses were not blocked. As mentioned before:
- no change in robots.txt
- no issue with rate limiting
- no changes in site-crawler configuration
-
@gilesd If you're experiencing issues with Moz Pro's Site Crawler showing that the site is banned while the On-Demand Crawler works fine, it might be due to changes or updates since November/December 2023. Both tools likely use the same crawler, "rogerbot," but differ in their operational schedules. The problem could be due to rate limiting or blocking by your server, IP blocking, changes in your robots.txt file, or updates in the Site Crawler configuration. To resolve this, check your robots.txt file to ensure it allows Moz's crawler, review server logs and firewall settings to ensure the crawler’s IP addresses aren’t blocked, and adjust rate limiting settings if necessary. Also, double-check the settings in Moz Pro to make sure there are no configurations causing the issue. If the problem persists, contact Moz support with detailed information about the error messages and any recent changes to your site’s configuration. Regular monitoring of your site’s interactions with automated tools and coordinating with your hosting provider can help prevent such issues in the future.
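As a quick way to do the robots.txt check mentioned above, the rules can be tested locally with Python's standard `urllib.robotparser`. This is a minimal sketch: the helper name, the sample robots.txt content, and the example.com URLs are assumptions for illustration, and the real file from your own site should be pasted in instead.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content; replace with the site's actual file.
SAMPLE_ROBOTS = """\
User-agent: rogerbot
Disallow: /private/

User-agent: *
Disallow:
"""

if __name__ == "__main__":
    for url in ("https://example.com/", "https://example.com/private/page"):
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS, "rogerbot", url) else "blocked"
        print(f"rogerbot is {verdict} for {url}")
```

Note that this only tests the robots.txt rules. It says nothing about firewall or header-based blocking, which happen before robots.txt is even considered.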
-
I am not sure why my reply is not appearing here. Just for confirmation, I am replying again.
I would like to confirm:
There is no modification in robots.txt
There are no issues with rate limiting
Moz Pro settings have not changed
We are looking for your help to identify the issue.
Thanks
-
Thanks for your troubleshooting tips.
I assure you nothing has changed in the robots.txt file or in any Moz Pro settings.
And there is a frequency limit: Site Crawler triggers only once every 2 weeks.
Thanks
-
Hi, gilesd
In Moz Pro, when using the Site Crawler or Recrawl, we also received a message indicating the site was banned. However, the on-demand crawler could generate the report successfully.
To address your question:
Robots.txt Configuration: Both the Site Crawler and on-demand crawler should be using the same robots.txt file unless there's been a recent change. Ensure your robots.txt hasn't been updated to block specific user agents.
IP Blocking or Rate Limiting: Some web servers or security settings might block or limit access based on IP or request frequency. The Site Crawler might be hitting these limits, whereas the on-demand crawler, being less frequent, avoids these blocks.
Moz Pro Settings: Double-check the Moz Pro settings to see if there have been any changes or updates to how the Site Crawler operates compared to the on-demand crawler. Any recent updates might have altered how the Site Crawler interacts with your site.
Thanks,
Hamza Zubair -