Block bad crawlers
-
Hi! how are you?
I've been working on some of my sites, and noticed that i'm getting lots of crawls by search engines that i'm not intereted in ranking well.
My question is the following: do you have a list of 'bad behaved' search engines that take lots of bandwidth and don´t send much/good traffic?
If so, do you know how to block them using robots.txt?
Thanks for the help!
Best wishes,
Ariel
-
Hey Ariel,
Here's a couple lists of bots that some people are blocking - you should probably review your server data to see which bots are visiting you that you want to block:
In addition to the moz resource Chris referenced, here are a couple more pages that might be useful for you:
- http://stackoverflow.com/questions/10793906/how-to-allow-known-web-crawlers-and-block-spammers-and-harmful-robots-from-scann
- http://www.distilled.net/u/robots-txt/
Good luck!
-
Chris gives a good answer, but is it really a problem, bandwidth is very cheap these days, in fact here in Australia most accounts are unlimited,
I Host with Microsoft Azure and bandwidth is very cheap.
-
Ariel, you could start with the list shown here and tailor it to fit your needs if you're having problems with others: http://www.webmasterworld.com/search_engine_spiders/4579553.htm. There's info there on using robots.txt to block them and you should also read this for info on using robots.txt file: Robots.txt and Meta Robots - SEO Best Practices - Moz
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can hotlinking images from multiple sites be bad for SEO?
Hi, There's a very similar question already being discussed here, but it deals with hotlinking from a single site that is owned by the same person. I'm interested whether hotlinking images from multiple sites can be bad for SEO. The issue is that one of our bloggers has been hotlinking all the images he uses, sometimes there are 3 or 4 images per blog from different domains. We know that hotlinking is frowned upon, but can it affect us in the SERPs? Thanks, James
Technical SEO | | OptiBacUK0 -
Site blocked by robots.txt and 301 redirected still in SERPs
I have a vanity URL domain that 301 redirects to my main site. That domain does have a robots.txt to disallow the entire site as well. However, for a branded enough search that vanity domain still shows up in SERPs and has the new Google message of: A description for this result is not available because of this site's robots.txt I get why the message is there - that's not my , my question is shouldn't a 301 redirect trump this domain showing in SERPs, ever? Client isn't happy about it showing at all. How can I get the vanity domain out of the SERPs? THANKS in advance!
Technical SEO | | VMLYRDiscoverability0 -
Temporarily suspend Googlebot without blocking users
We'll soon be launching a redesign, on a new platform, migrating millions of pages to new URLs. How can I tell Google (and other crawlers) to temporarily (a day or two) ignore my site? We're hoping to buy ourselves a small bit of time to verify redirects and live functionality before allowing Google to crawl and index the new architecture. GWT's recommendation is to 503 all pages - including robots.txt, but that also makes the site invisible to real site visitors, resulting in significant business loss. Bad answer. I've heard some recommendations to disallow all user agents in robots.txt. Any answer that puts the millions of pages we already have indexed at risk is also a bad answer. Thanks
Technical SEO | | lzhao0 -
Would having the same paragraph on every product page be bad?
I am trying to figure out if having the same paragraph on every product would be a bad thing. I know it would be bad to have the same description on every product, but this isn't a description it is a helpful paragraph stating this: Having trouble finding the wheelchair part you need? Please call us at 1-800-328-5343 or fill out the (Link)Wheelchair Parts Request Form(Link). One of our friendly customer service representatives will be happy to help you. Or would it be best to just have the "wheelchair parts request form" Link on every page Or would it be best to have neither and try putting that in a higher category making it on one page instead of every product page?
Technical SEO | | Mike.Bean0 -
Prevent Google Web Preview bot from seeing pop-up,m bad for SEO?
Hey guys, On our website, we have a lightbox pop-up showing an external page with an e-mail newsletter signup form. It it shown to some 5% of our visitors and works with a cookie to prevent the popup from showing at each visit. Recently, I saw the popup displayed in the Google SERP instant preview, for every page. The preview looks messed up. We could prevent the popup to be shown to the google web preview bot by blocking this user agent. Question is: Will it hurt our SEO? Because we show the web preview bot (not the crawl bot) something different than what a visitor may see BQrS7 BQrS7.jpg
Technical SEO | | Webprint0 -
Micro formats to block HTML text portions of pages
I have a client that wants to use micro formatting to keep a portion of their page (the disclaimer) from being read by the search engines. They want to do this because it will help with their keyword density on the rest of the page and block the “bad keywords” that come from their legally required disclaimer. We have suggested alternate methods to resolve this problem, but they do not want to implement those, they just want a POV from us explaining how this micro formatting process will work. And that’s where the problem is. I’ve never heard of this use case and can’t seem to find anyone who has. I'm posting the question to the Moz Community to see if anyone knows how microformats can keep copy from being crawled by the bots. Please include any links to sites that you know that are using micro formatting in this way. Have you implemented it and seen results? Do you know of a website that is using it now? We're looking for use cases please!
Technical SEO | | Merkle-Impaqt0 -
Seeing non-www in yahoo results - good or bad?
My site ranks for both domain versions but more non-www than www - Should I make it one or the other? How do I tell Yahoo to just choose one? Ehh?
Technical SEO | | DavidS-2820610 -
Javascript bad for SEO?
If we utilize javascript to pull information from a database to display on a site, is that bad for SEO? Can search engines still see the data?
Technical SEO | | nicole.healthline0