Blocking Affiliate Links via robots.txt

Mark_Ginsberg

Hi,

I work with a client who has a large affiliate network pointing to their domain, which is a large part of their inbound marketing strategy. All of these links point to the subdomain affiliates.example.com, which then sends each link through a 301 redirect to the relevant target page. These links have been showing up in Webmaster Tools as top linking domains and also in the latest downloaded links reports. To follow guidelines and ensure that Google doesn't count these links for either positive or negative impact on the site, we have added a block in the robots.txt of the affiliates.example.com subdomain, blocking search engines from crawling the entire subdomain. The robots.txt file contains the following:

User-agent: *
Disallow: /

We have verified the subdomain in Google Webmaster Tools and made certain that Google can reach and read the robots.txt file, so we know Google is being blocked from crawling the affiliates subdomain. However, although we added this block a few weeks ago, links are still showing up in the latest links download report with a first-discovered date after the block went in. We want to make sure that the block was implemented properly and that these links aren't being used to negatively impact the site.

Any suggestions or clarification would be helpful. If the subdomain is blocked for search engines, why are the search engines following the links and reporting them as latest links in the GWMT account for www.example.com? And if the block is implemented properly, will the total number of links pointing to our site, as reported in the "Links to Your Site" section, be reduced, or does the block have no impact on that figure?

From a development standpoint, it's a much easier fix for us to adjust the robots.txt file than to change the affiliate redirect from a 301 to a 302, which is why we decided to go with this option.

Any help you can offer will be greatly appreciated.

Thanks,
Mark

FedeEinhorn

I think you did the right thing. Search engines will take a while to re-crawl your robots.txt and actually start following what you specified.

Extra steps I would take:

Change the redirect to a 302. It is probably just one line of code that does the redirect after setting some cookies or session variables.
Try to edit the affiliate codes to work with JavaScript instead of naked URLs (it could be something like <ins class="affiliate"></ins> that is later switched to a text link or banner using JS). This will not only allow you to set a nofollow on those links, but you would also be able to remove or block specific affiliates or pages where you don't want your links or banners.
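To give a rough idea of how small the 302 change can be, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the AFFILIATE_TARGETS mapping, the /go/partner-a path, and the target URL are hypothetical, and your real redirect script will look different. The only point is that the status code is a single value sent with the Location header.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Hypothetical mapping from affiliate link paths to target pages.
AFFILIATE_TARGETS = {
    "/go/partner-a": "https://www.example.com/products/widget",
}


def resolve_redirect(request_path):
    """Return (status_code, location) for an incoming affiliate link."""
    # Ignore any query string when looking up the affiliate path.
    path = urlsplit(request_path).path
    target = AFFILIATE_TARGETS.get(path)
    if target is None:
        return 404, None
    # 302 (temporary) instead of 301 (permanent) is the one-value change.
    return 302, target


class AffiliateRedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve_redirect(self.path)
        self.send_response(status)
        if location is not None:
            # Cookie or session tracking would be set here, before the redirect.
            self.send_header("Location", location)
        self.end_headers()


if __name__ == "__main__":
    # Local test server only; not meant for production use.
    HTTPServer(("127.0.0.1", 8000), AffiliateRedirectHandler).serve_forever()
```

Requesting http://127.0.0.1:8000/go/partner-a against this test server should answer with a 302 and a Location header pointing at the mapped page.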

PS: Have you tried fetching your robots.txt from Google Webmaster Tools (Crawl -> Blocked URLs) to see when it was last downloaded and whether the contents are OK?
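If you also want to sanity-check the rules yourself, the sketch below uses Python's standard urllib.robotparser to confirm that the two-line file disallows every path for a given crawler. The is_blocked helper and the sample URL are made up for illustration. Note that this only tests the rule text locally; it says nothing about which copy of robots.txt Google currently has cached, so the Webmaster Tools check is still worth doing.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question, as they should appear in the file.
ROBOTS_TXT = """User-agent: *
Disallow: /
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if the given robots.txt text disallows url for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical affiliate link on the blocked subdomain.
    sample_url = "http://affiliates.example.com/go/partner-a"
    print(is_blocked(ROBOTS_TXT, "Googlebot", sample_url))
```

To test the live file instead of a pasted copy, RobotFileParser also offers set_url() and read(), which download the robots.txt before you call can_fetch().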
