How can I make Google Webmaster Tools see the robots.txt file when I am doing an .htaccess redirect?
-
We are moving a site to a new domain. I have set up an .htaccess file and it is working fine. My problem is that Google Webmaster Tools now says it cannot access the robots.txt file on the old site. How can I make it still see the robots.txt file when the .htaccess is doing a full site redirect?
.htaccess currently has:
Options +FollowSymLinks -MultiViews
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www.)?michaelswilderhr.com$ [NC]
RewriteRule ^ http://www.s2esolutions.com/ [R=301,L]
Google Webmaster Tools is reporting:
Over the last 24 hours, Googlebot encountered 1 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%.
-
Possible solutions for your problem:
.htaccess authentication blocking robots.txt
301 redirect. How to make an exception for the robots.txt
http://forum.cs-cart.com/topic/23747-301-redirect-how-to-make-an-exception-for-the-robotstxt/
Canonical robots.txt
http://digwp.com/2011/03/htaccess-wordpress-seo-security/
General .htaccess tutorials:
http://httpd.apache.org/docs/2.0/howto/htaccess.html
http://httpd.apache.org/docs/2.0/misc/rewriteguide.html
-
Thank you, that seems to be working.
-
You could add an exception to the .htaccess so that robots.txt is still served from the old domain instead of being redirected. You would do this by adding another condition. I'd use something like:
<code>Options +FollowSymLinks -MultiViews
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{HTTP_HOST} ^(www\.)?michaelswilderhr\.com$ [NC]
RewriteRule ^ http://www.s2esolutions.com/ [R=301,L]</code>
Disclaimer: I am lucky enough to have people at work who check these things. This hasn't been checked! Use at your own discretion.
However I'll admit that I've never used this. I just stick the 301 in and it all seems to work out fine. Probably done it on hundreds of domains over the years.
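Since the rule above is untested, one way to sanity-check the logic before deploying is to mimic the two conditions in a few lines of Python. This is only a rough sketch: it approximates how mod_rewrite evaluates the conditions and does not run Apache. The function name `redirect_target` and the sample hosts and paths are hypothetical, and the robots.txt pattern is written in an anchored form, which is an assumption.

```python
import re

# Patterns modelled on the RewriteCond lines in the answer above.
# The anchored robots.txt pattern is an assumption made for this sketch.
ROBOTS_EXCEPTION = re.compile(r"^/robots\.txt$")
OLD_HOST = re.compile(r"^(www\.)?michaelswilderhr\.com$", re.IGNORECASE)  # [NC]
NEW_SITE = "http://www.s2esolutions.com/"


def redirect_target(host, request_uri):
    """Return the redirect URL, or None if the request is served as-is.

    Hypothetical helper: approximates the two conditions plus the rule.
    """
    # First condition: anything that is exactly /robots.txt is exempt.
    if ROBOTS_EXCEPTION.search(request_uri):
        return None
    # Second condition: only the old domain (with or without www) redirects.
    if not OLD_HOST.search(host):
        return None
    # The rule sends every remaining request to the new site's home page.
    return NEW_SITE


if __name__ == "__main__":
    for path in ("/robots.txt", "/", "/about"):
        print(path, "->", redirect_target("www.michaelswilderhr.com", path))
```

If the sketch behaves as expected, the real test is still to request robots.txt and an ordinary page on the old domain and compare the response codes.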