Robots.txt question
-
Hello,
What does the following robots.txt rule mean?
User-agent: *
Allow: /
Does it mean that we are blocking all spiders? Is Allow supported in robots.txt?
Thanks
-
It's a good idea to have an XML sitemap and to make sure the search engines know where it is. It's part of the sitemap protocol that they will look in the robots.txt file for the location of your sitemap.
-
I was assuming that by including / after Allow, we were blocking the spiders, and I also thought that Allow was not supported by search engines.
Thanks for the clarification. A better approach would be
User-agent: *
Allow:
right?
The best one of course is
**User-agent: * Disallow:**
-
That's not really necessary unless there are URLs or directories you're disallowing after the Allow in your robots.txt. Allow is a directive supported by the major search engines, but search engines assume they're allowed to crawl everything they find unless you specifically disallow it in your robots.txt.
The following is universally accepted by bots and essentially means the same thing as what I think you're trying to say, allowing bots to crawl everything:
User-agent: *
Disallow:
There's a sample use of the Allow directive on the Wikipedia robots.txt page here.
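To check the claim that these forms mean the same thing, here is a small sketch using Python's standard-library urllib.robotparser. The crawler name "ExampleBot" and the example.com URL are made up for illustration, and this only shows how Python's parser reads the rules; individual search engines may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`.

    "ExampleBot" is a hypothetical crawler name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The three variants discussed in the thread, one directive per line.
ALLOW_ALL = "User-agent: *\nAllow: /\n"
EMPTY_DISALLOW = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical URL, chosen only for the demonstration.
url = "http://www.example.com/some/page.html"

for name, rules in [
    ("Allow: /", ALLOW_ALL),
    ("Disallow: (empty)", EMPTY_DISALLOW),
    ("Disallow: /", BLOCK_ALL),
]:
    print(f"{name:20} crawl allowed: {can_crawl(rules, url)}")
```

With this parser, the first two variants permit crawling and only "Disallow: /" blocks it, which matches the explanation above.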
-
There's more information about robots.txt from SEOmoz at http://www.seomoz.org/learn-seo/robotstxt
SEOmoz and the robots.txt site suggest the following for allowing robots to see everything and for listing your sitemap:
User-agent: *
Disallow:
Sitemap: http://www.example.com/none-standard-location/sitemap.xml
-
Any particular reason for doing so?
-
That robots.txt should be fine.
But you should also add your XML sitemap to the robots.txt file, for example:
User-agent: *
Allow: /
Sitemap: http://www.website.com/sitemap.xml
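For anyone who wants to confirm that a Sitemap line is actually being picked up, here is a rough sketch with Python's urllib.robotparser (its site_maps() method needs Python 3.8 or later). The helper name sitemaps_in is my own, and the second sample deliberately runs Disallow: and Sitemap: together on one line to show why each directive needs its own line, at least as far as this parser is concerned.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in(robots_txt: str):
    """Return the Sitemap URLs found in robots.txt text, or None if there are none.

    `sitemaps_in` is a hypothetical helper name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps()


# One directive per line, as in the example above.
GOOD = (
    "User-agent: *\n"
    "Allow: /\n"
    "Sitemap: http://www.website.com/sitemap.xml\n"
)

# Disallow and Sitemap fused onto a single line: this parser reads the whole
# remainder as the Disallow path, so no sitemap is registered.
FUSED = (
    "User-agent: *\n"
    "Disallow:Sitemap: http://www.website.com/sitemap.xml\n"
)

print(sitemaps_in(GOOD))
print(sitemaps_in(FUSED))
```

The first call lists the sitemap URL; the second finds none, because the fused line is treated as a Disallow rule and not as a Sitemap directive.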