Is having no robots.txt file the same as having one and allowing all agents?
-
The site I am working on currently has no robots.txt file. However, I have just uploaded a sitemap and would like to point the robots.txt file to it.
Once I upload the robots.txt file, if I allow access to all agents, is this the same as when the site had no robots.txt file at all; do I need to specify crawler access on can the robots.txt file just contain the link to the sitemap?
-
According to me a sitemap is more important than robots.txt as it help a search engine bot in effectively crawling a website. Robots.txt is generally used to request (allow: or disallow:)a crawler not to crawl and index certain section of your website containing sensitive data. This is totally upto the crawler to respect the request by not crawling and indexing that sensitive part. However, it is a general practice among webmasters world wide to have a robots.txt file for each of their sites. A common robots.txt with permission to access the entire website should look like this:
User-agent: *
Disallow:Sitemap: http://www.yoursite.com/sitemap.xml
So if you want some section (folders, directories) of your site not to be crawled by a bot then you can use a robots.txt.
Yes logically its the same like having a robots.txt file granting all the access and not having one completely. Its just a difference between like something having 'by default". Having a robots.txt file doesn't guarantee a rank boost in the SERP. Hope it helps. For more understanding please refer these resources:
Cheers
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to allow bots to crawl all but WP-content
Hello, I would like my website to remain crawlable to bots, but to block my wp content and media. Does the following robots.txt work? I worry that the * user agent may conflict with the others. User-agent: *
Technical SEO | | Tom3_15
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/ User-agent: GoogleBot
Allow: / User-agent: GoogleBot-Mobile
Allow: / User-agent: GoogleBot-Image
Allow: / User-agent: Bingbot
Allow: / User-agent: Slurp
Allow: /0 -
2 sitemaps on my robots.txt?
Hi, I thought that I just could link one sitemap from my site's robots.txt but... I may be wrong. So, I need to confirm if this kind of implementation is right or wrong: robots.txt for Magento Community and Enterprise ...
Technical SEO | | Webicultors
Sitemap: http://www.mysite.es/media/sitemap/es.xml
Sitemap: http://www.mysite.pt/media/sitemap/pt.xml Thanks in advance,0 -
Does uploading a new disavow file wipe out the original?
Hi guys, Just struggling to get a definitive answer on this one. If say I disavow 55 domains, then upload a brand new disavow file with on 35 domains in it, does this mean the original disavow file will be overwritten and those original domains will be forgotten about? Kind regards!
Technical SEO | | WCR0 -
Best way to redirect 3 sites to 1 new one.
Hi All We currently have 3 old sites that have tones of content. Due to brand/business consolidation we have merge all 3 to produce 1 website. The new site contains all the old content from the old 3. So, I know I need to 301 redirect all the old content from the previous sites to the equivelent content on the new sites but am confused how you do this with 3 domains? One of the domains is being replaced with the new site. So I have: www.domain1.co.uk www.domain2.co.uk www.domain3.co.uk All the content for all the sites have been imported into a new site and any duplicate content issues havce been resolved. Can anyone point me in the right direction? Thanks
Technical SEO | | EclipseLegal0 -
One Page - Targeting Multiple Low Searched Keywords.
Hi, First "Question" on SEOmoz, A client has requested me to have all the traffic going to the main/home page. In total its 25 Keywords and competition is pretty low, lets say "Builder In City", all the keywords are the sames except for the citys. "Builder In London" "Builder in Birmingham" Builder in Cardiff" .. and so on. Will it be ok and do-able to target 1 homepage with 25 keywords and expect decent results.
Technical SEO | | Prestige-SEO0 -
Does RogerBot read URL wildcards in robots.txt
I believe that the Google and Bing crawlbots understand wildcards for the "disallow" URL's in robots.txt - does Roger?
Technical SEO | | AspenFasteners0 -
.htacess file format for Apache Server
Hi, My website having canonical issue for home page, I have written the .htaccess file and upload the root directory. But still I didn't see any changes in the home page. I am copying syntax which one I have written in the .htaccess file. Please review the syntax and let me know the changes. Options +FollowSymlinks RewriteEngine on #RewriteBase / re-direct index.htm to root / ### RewriteCond %{THE_REQUEST} ^./index.htm\ HTTP/ RewriteRule ^(.)index.htm$ /$1 [R=301,L] re-direct IP address to www ### re-direct non-www to www ### re-direct any parked domain to www of main domain RewriteCond %{http_host} !^www.metricstream.com$ [nc] RewriteRule ^(.*)$ http://www.metricstream.com/$1 [r=301,nc,L] Is there any specific htaccess file format for apache server? Thanks, Karthik
Technical SEO | | karthik-1755440 -
Robots.txt
My campaign hse24 (www.hse24.de) is not being crawled any more ... Do you think this can be a problem of the robots.txt? I always thought that Google and friends are interpretating the file correct, seen that he site was crawled since last week. Thanks a lot Bernd NB: Here is the robots.txt: User-Agent: * Disallow: / User-agent: Googlebot User-agent: Googlebot-Image User-agent: Googlebot-Mobile User-agent: MSNBot User-agent: Slurp User-agent: yahoo-mmcrawler User-agent: psbot Disallow: /is-bin/ Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_DisplayProductInformation-Start Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/ Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/Root%20Edition/units/HSE24/Beratung/
Technical SEO | | remino630