Block an entire subdomain with robots.txt?
-
Is it possible to block an entire subdomain with robots.txt?
I write for a blog that has their root domain as well as a subdomain pointing to the exact same IP. Getting rid of the option is not an option so I'd like to explore other options to avoid duplicate content. Any ideas?
-
Awesome! That did the trick -- thanks for your help. The site is no longer listed
-
Fact is, the robots file alone will never work (the link has a good explanation why - short form: all it does is stop the bots from indexing again).
Best to request removal then wait a few days.
-
Yeah. As of yet, the site has not been de-indexed. We placed the conditional rule in htaccess and are getting different robots.txt files for the domain and subdomain -- so that works. But I've never done this before so I don't know how long it's supposed to take?
I'll try to verify via Webmaster Tools to speed up the process. Thanks
-
You should do a remove request in Google Webmaster Tools. You have to first verify the sub-domain then request the removal.
See this post on why the robots file alone won't work...
http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts
-
Awesome. We used your second idea and so far it looks like it is working exactly how we want. Thanks for the idea.
Will report back to confirm that the subdomain has been de-indexed.
-
Option 1 could come with a small performance hit if you have a lot of txt files being used on the server.
There shouldn't be any negative side effects to option 2 if the rewrite is clean (IE not accidently a redirect) and the content of the two files are robots compliant.
Good luck
-
Thanks for the suggestion. I'll definitely have to do a bit more research into this one to make sure that it doesn't have any negative side effects before implementation
-
We have a plugin right now that places canonical tags, but unfortunately, the canonical for the subdomain points to the subdomain. I'll look around to see if I can tweak the settings
-
Sounds like (from other discussions) you may be stuck requiring a dynamic robot.txt file which detects what domain the bot is on and changes the content accordingly. This means the server has to run all .txt file as (I presume) PHP.
Or, you could conditionally rewrite the /robot.txt URL to a new file according to sub-domain
RewriteEngine on
RewriteCond %{HTTP_HOST} ^subdomain.website.com$
RewriteRule ^robotx.txt$ robots-subdomain.txtThen add:
User-agent: *
Disallow: /to the robots-subdomain.txt file
(untested)
-
Placing canonical tags isn't an option? Detect that the page is being viewed through the subdomain, and if so, write the canonical tag on the page back to the root domain?
Or, just place a canonical tag on every page pointing back to the root domain (so the subdomain and root domain pages would both have them). Apparently, it's ok to have a canonical tag on a page pointing to itself. I haven't tried this, but if Matt Cutts says it's ok...
-
Hey Ryan,
I wasn't directly involved with the decision to create the subdomain, but I'm told that it is necessary to create in order to bypass certain elements that were affecting the root domain.
Nevertheless, it is a blog and the users now need to login to the subdomain in order to access the Wordpress backend to bypass those elements. Traffic for the site still goes to the root domain.
-
They both point to the same location on the server? So there's not a different folder for the subdomain?
If that's the case then I suggest adding a rule to your htaccess file to 301 the subdomain back to the main domain in exactly the same way people redirect from non-www to www or vice-versa. However, you should ask why the server is configured to have a duplicate subdomain? You might just edit your apache settings to get rid of that subdomain (usually done through a cpanel interface).
Here is what your htaccess might look like:
<ifmodule mod_rewrite.c="">RewriteEngine on
# Redirect non-www to wwww
RewriteCond %{HTTP_HOST} !^www.mydomain.org [NC]
RewriteRule ^(.*)$ http://www.mydomain.org/$1 [R=301,L]</ifmodule> -
Not to me LOL I think you'll need someone with a bit more expertise in this area than I to assist in this case. Kyle, I'm sorry I couldn't offer more assistance... but I don't want to tell you something if I'm not 100% sure. I suspect one of the many bright SEOmozer's will quickly come to the rescue on this one.
Andy
-
Hey Andy,
Herein lies the problem. Since the domain and subdomain point to the exact same place, they both utilize the same robots.txt file.
Does that make sense?
-
Hi Kyle Yes, you can block an entire subdomain via robots.txt, however you'll need to create a robots.txt file and place it in the root of the subdomain, then add the code to direct the bots to stay away from the entire subdomain's content.
User-agent: *
Disallow: /hope this helps
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Choosing root domain for a subdomain
We own two root domains in the .edu space. According to Open Site Explorer, one has a domain authority of 76, while the other has a DA of 94. We operate a collection of degree microsites as subdomains of the lower-ranking root domain, e.g. www.degreename.domain76.edu. All other things being equal, would these sites benefit if we migrated them to www.degreename.domain94.edu? The question seems to hinge on whether subdomains inherit any of the root domain's authority, and the answers I have seen to that question are "sometimes" and "maybe". Lastly, as an alternative, would we realize greater SEO improvements by moving the degrees to a directory structure under domain94, i.e. www.domain94.edu/degrees/degree-name? Thank you for your help!
Intermediate & Advanced SEO | | UWPCE0 -
How to properly 404 pages from a subdomain
SO I am working on a site that had a subdomain that attracted a lot of spammy links. I researched the backlinks to this subdomain, and there were no beneficial links at all. I am thinking the best thing is to 404 this subdomain. What is the best way to do this? Should I just edit the DNS settings so that this subdomain does not point to the root domain? Or is there something that should be done in webmaster tools? Thanks in advance!
Intermediate & Advanced SEO | | evan890 -
Should I disallow via robots.txt for my sub folder country TLD's?
Hello, My website is in default English and Spanish as a sub folder TLD. Because of my Joomla platform, Google is listing hundreds of soft 404 links of French, Chinese, German etc. sub TLD's. Again, i never created these country sub folder url's, but Google is crawling them. Is it best to just "Disallow" these sub folder TLD's like the example below, then "mark as fixed" in my crawl errors section in Google Webmaster tools?: User-agent: * Disallow: /de/ Disallow: /fr/ Disallow: /cn/ Thank you, Shawn
Intermediate & Advanced SEO | | Shawn1240 -
Changing a subdomain to a full domain to rank for a keyword
We have been attempting to get our blogsite to rank for our business name(Instabill). We are now considering changing the url from blog.instabill.com to something like instabillblog.com. I have following concerns about the change; Will changing the domain really be that helpful (i.e. will the change get our blog on page one for the term instabill) We have over 350 pages of content on our blog. Will changing the domain have possible negative effects ( I was thinking of using url updater in webmaster tools and creating a permanent 301 redirect from the older url to the new) Having never changed a url for a site with this much content and seo value for my company I would like to know the following from someone who has made mistakes here before; what not to do what steps you would take to make the transition easier Any help here will be greatly appreciated. cheers, Instabill
Intermediate & Advanced SEO | | Instabill0 -
Archiving a festival website - subdomain or directory?
Hi guys I look after a festival website whose program changes year in and year out. There are a handful of mainstay events in the festival which remain each year, but there are a bunch of other events which change each year around the mainstay programming.This often results in us redoing the website each year (a frustrating experience indeed!) We don't archive our past festivals online, but I'd like to start doing so for a number of reasons 1. These past festivals have historical value - they happened, and they contribute to telling the story of the festival over the years. They can also be used as useful windows into the upcoming festival. 2. The old events (while no longer running) often get many social shares, high quality links and in some instances still drive traffic. We try out best to 301 redirect these high value pages to the new festival website, but it's not always possible to find a similar alternative (so these redirects often go to the homepage) Anyway, I've noticed some festivals archive their content into a subdirectory - i.e. www.event.com/2012 However, I'm thinking it would actually be easier for my team to archive via a subdomain like 2012.event.com - and always use the www.event.com URL for the current year's event. I'm thinking universally redirecting the content would be easier, as would cloning the site / database etc. My question is - is one approach (i.e. directory vs. subdomain) better than the other? Do I need to be mindful of using a subdomain for archival purposes? Hope this all makes sense. Many thanks!
Intermediate & Advanced SEO | | cos20300 -
301 entire site
Is there a good 301 code snippet to change just the root domain but keep the ending extensions? I just bid on a domain that I think would be much better for me moving forward, but do not want to have to try going through thousands of pages to do their 301 individually My site is almost 4 yrs old. Well established and has a large fanbase. Several of our social networks are under the name of the new branded domain, hence part of the desire to switch.
Intermediate & Advanced SEO | | Atomicx0 -
What's the best practise for adding a blog to your site post panda? subdomain or subdirectory???
Should i use a subdomain or a subdirectory? i was going to use a subdirectory however i have been reading a lot of articles on the use of subdomains post panda and the advantages of using them instead of using subdirectories. Thanks Ari
Intermediate & Advanced SEO | | dublinbet0 -
We are a web hosting company and some of our best links are from our own customers, on the same IP, but different Class C blocks..
We are a web hosting company and some of our best links are from our own customers, on the same IP same IP, but different Class C blocks. How do search engines treat the uniqie scenario of web hosting companies and linking?
Intermediate & Advanced SEO | | FirePowered0