Is having no robots.txt file the same as having one and allowing all agents?
-
The site I am working on currently has no robots.txt file. However, I have just uploaded a sitemap and would like to point the robots.txt file to it.
Once I upload the robots.txt file, if I allow access to all agents, is this the same as when the site had no robots.txt file at all? Do I need to specify crawler access, or can the robots.txt file just contain the link to the sitemap?
-
In my opinion, a sitemap is more important than robots.txt, as it helps a search engine bot crawl a website effectively. Robots.txt is generally used to ask a crawler (via Allow: or Disallow: directives) not to crawl and index certain sections of your website, such as those containing sensitive data. It is entirely up to the crawler to respect the request by not crawling and indexing those sections. However, it is general practice among webmasters worldwide to have a robots.txt file for each of their sites. A common robots.txt granting access to the entire website looks like this:
User-agent: *
Disallow:
Sitemap: http://www.yoursite.com/sitemap.xml
So if you want some sections (folders, directories) of your site not to be crawled by a bot, you can use a robots.txt file.
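One way to check how these variants behave is to run them through Python's standard-library robots.txt parser. This is only a rough sketch: the empty string stands in for a site with no robots.txt file, the `/private/` folder and the `AnyBot` user agent are made-up names for illustration, and real crawlers may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text):
    """Parse robots.txt content from a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


SITEMAP = "http://www.yoursite.com/sitemap.xml"

# Empty rule set: used here as a stand-in for "no robots.txt at all".
NO_FILE = ""
# Explicit allow-all file with a sitemap reference.
ALLOW_ALL = f"User-agent: *\nDisallow:\nSitemap: {SITEMAP}\n"
# File containing nothing but the sitemap reference.
SITEMAP_ONLY = f"Sitemap: {SITEMAP}\n"
# Hypothetical file that blocks a single folder.
BLOCK_SECTION = "User-agent: *\nDisallow: /private/\n"

PAGE = "http://www.yoursite.com/any/page.html"
PRIVATE_PAGE = "http://www.yoursite.com/private/page.html"

if __name__ == "__main__":
    variants = {
        "no file": NO_FILE,
        "allow all": ALLOW_ALL,
        "sitemap only": SITEMAP_ONLY,
        "block /private/": BLOCK_SECTION,
    }
    for name, text in variants.items():
        rules = parse_rules(text)
        print(
            name,
            "| page allowed:", rules.can_fetch("AnyBot", PAGE),
            "| private allowed:", rules.can_fetch("AnyBot", PRIVATE_PAGE),
            "| sitemaps:", rules.site_maps(),  # site_maps() needs Python 3.8+
        )
```

Under this parser, the first three variants allow every URL, and only the last one refuses the `/private/` page, so a file holding just the Sitemap line imposes no crawl restrictions.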
Yes, logically, having a robots.txt file that grants full access is the same as not having one at all. The only difference is that one states explicitly what the other leaves as the default. Having a robots.txt file doesn't guarantee a ranking boost in the SERPs. Hope it helps. For more understanding, please refer to these resources:
Cheers