Is it terrible to not have a robots.txt?
-
I was under the impression that you really should have a robots.txt file, and that not having one is pretty bad. However, HubSpot (which I'm not impressed with) is not capable of properly implementing one. Will this hurt the site?
-
Thank you, everyone! Great stuff.
-
And for reference, this Search Engine Land column from 2009 gives a pretty concise treatment of how to use our friend the robots exclusion protocol: "A Deeper Look At Robots.txt"
-
Hi Jaycie,
Google's view is that you should have a robots.txt file, so there is no risk of your web host handling the request for it in an unexpected way and returning something strange.
Matt Cutts talked about robots.txt in this Webmaster Help Video last month.
Hope that helps,
Sha
-
Technically you don't need one.
However, it is so easy to put one in place (usually) that I would consider it a best practice. Skipping it is like developing an application without taking notes on the implementation: what happens when someone new comes along to work with it? They will ask the same question. Not having one and deliberately allowing robots to crawl everything are two inherently different things.
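To make the "deliberately allow everything" case concrete, here is a minimal sketch using Python's standard urllib.robotparser. The file contents and the www.example.com URL are made up for illustration; they are not from anyone's real site. An empty Disallow line is the conventional way to say "nothing is off limits".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that explicitly allows all crawlers everywhere.
# An empty "Disallow:" value means no path is blocked.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Placeholder URL, used only to exercise the parser.
    print(can_crawl(ALLOW_ALL, "https://www.example.com/any/page.html"))
```

The crawl result is the same as having no file at all, but the intent is written down for whoever works on the site next.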
-
It won't hurt the site. You only need one if you want to block search engines from parts of your site, or block particular search bots. If you don't have any pages or directories to disallow, I wouldn't worry about it.
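As a rough illustration of those two uses, the sketch below checks a few URLs against a made-up rule set with Python's standard urllib.robotparser. The /private/ directory, the "BadBot" crawler name, and the www.example.com URLs are all hypothetical. It also runs the same check against an empty rule set, which this parser treats as "everything allowed".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: hide one directory from every crawler,
# and shut out one specific (made-up) bot entirely.
RULES = """\
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    home = "https://www.example.com/index.html"
    hidden = "https://www.example.com/private/report.html"

    print(allowed(RULES, "Googlebot", home))    # general crawler, public page
    print(allowed(RULES, "Googlebot", hidden))  # general crawler, blocked directory
    print(allowed(RULES, "BadBot", home))       # the singled-out bot
    print(allowed("", "Googlebot", hidden))     # no rules at all
```

With these assumptions, only the second and third checks should come back False. With nothing to disallow, the rules add nothing to the crawl outcome.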