Google insists robots.txt is blocking... but it isn't.
-
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site.
When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallows a couple of directories. You can view the robots.txt at http://photogeardeals.com/robots.txt
Google (via Webmaster Tools) insists that my robots.txt file contains a "Disallow: /" on line 2, and that this is preventing Google from indexing the site and preventing me from submitting a sitemap. These errors show up in both the Sitemaps section and the Blocked URLs section of Webmaster Tools.
Bing's webmaster tools are able to read the site and sitemap just fine.
Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
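For anyone who wants to rule out the file itself, here is a minimal sketch using Python's standard `urllib.robotparser` to test what a given robots.txt body allows Googlebot to fetch. The two sample files are assumptions for illustration: `BLOCK_ALL` is the blanket rule Google reports seeing, and `SELECTIVE` is a hypothetical stand-in for a file that only disallows a couple of directories (the directory names are made up, not taken from the live site).

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if the robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The blanket block that Webmaster Tools claims to see on line 2.
BLOCK_ALL = """User-agent: *
Disallow: /
"""

# Hypothetical selective file; the directory names are illustrative only.
SELECTIVE = """User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

if __name__ == "__main__":
    home = "http://photogeardeals.com/"
    print("block-all file allows home page:", googlebot_allowed(BLOCK_ALL, home))
    print("selective file allows home page:", googlebot_allowed(SELECTIVE, home))
```

If the text currently served at the robots.txt URL passes this kind of check, the "Disallow: /" that Google reports is more likely a cached copy of the old file than a problem with the new one.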
-
Hi Aaron - You have a couple of solid answers here. Has your issue been resolved in GWT?
-
24 hours is a short time, and Google has probably not re-crawled the site or even looked at your new robots.txt yet.
Google Webmaster Tools updates much more slowly than Bing's tools, so be patient.
As a rule of thumb, I wait at least a week with Google before worrying (my 2 cents).
-
Hi Aaron,
I identify with your frustration, but want to lead my response with the caveat that I am not a developer so there may be people here with much more technical SEO expertise than me who might have a better answer.
What I do know is that Google Webmaster Tools data is not real time and can often take days or even weeks to update. It could be that GWT is showing something different about your robots.txt file because it's working from old information that hasn't been refreshed yet.
When I looked at your robots.txt file, I found two sitemaps, one with 2 URLs and one with 8 URLs. This is pretty tiny. Even in the old days, conventional wisdom was that it took at least 20 content pages in order for Google to take note and index the site.
Have you tried posting the URLs of your new site on Google+? I have heard that this is a great indexing tool in addition to the Fetch as Googlebot in GWT. Just a thought!
You know, there was a time when it took 6-8 weeks for a new site to get indexed. Google has definitely sped up to the point where I think we are all expecting instant results and sometimes that just doesn't happen.
I think this just might be a matter of patience. However, I am always willing to admit that I could be wrong and am interested to know what others think!
Dana
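The reply above mentions finding the sitemap entries listed in robots.txt. A minimal sketch of how those `Sitemap:` lines could be pulled out of a robots.txt body is below. The function name `list_sitemaps` and the example.com URLs are assumptions for illustration, not the actual contents of the site's file.

```python
def list_sitemaps(robots_txt):
    """Return the URLs from every 'Sitemap:' line in a robots.txt body."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Split on the first colon only, so the colon in "http://" survives.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical robots.txt body; the URLs are placeholders.
SAMPLE = """User-agent: *
Disallow: /wp-admin/

Sitemap: http://example.com/post-sitemap.xml
Sitemap: http://example.com/page-sitemap.xml
"""

if __name__ == "__main__":
    for url in list_sitemaps(SAMPLE):
        print(url)
```

Each URL returned this way can then be opened directly to count how many pages the sitemap actually lists.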