Restricted by robots.txt: does this cause problems?
-
According to Webmaster Tools, I have around 1,500 links restricted by robots.txt. These are links to retailers' websites and affiliate links.
Is this the right approach? I thought it would affect the link juice. Or should I remove the nofollow from the links restricted by the robots.txt file?
-
Hello Ocelot,
I am assuming you have a site that has affiliate links and you want to keep Google from crawling those affiliate links. If I am wrong, please let me know. Going forward with that assumption then...
That is one way to do it. So perhaps you first send all of those links through a redirect via a folder called /out/ or /links/ or whatever, and you have blocked that folder in the robots.txt file. Correct? If so, this is how many affiliate sites handle the situation.
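As a rough sketch of that setup, the snippet below checks a robots.txt rule against a couple of URLs using Python's standard `urllib.robotparser`. The robots.txt content, the `is_crawlable` helper, and the example URLs are assumptions for illustration only; the `/out/` folder name is taken from the example above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks the redirect folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /out/
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules in ROBOTS_TXT allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # An affiliate redirect inside the blocked folder.
    print(is_crawlable("http://www.yoursite.com/out/retailer-123"))
    # A normal content page outside the blocked folder.
    print(is_crawlable("http://www.yoursite.com/products/widget"))
```

This only confirms how a compliant crawler would read the rule; it says nothing about crawlers that ignore robots.txt.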
I would not rely on rel="nofollow" alone, though I would use that in addition to the robots.txt block.
There are many other ways to handle this. For instance, you could make all affiliate links JavaScript links instead of href links. Then you could put the JavaScript into a folder called /js/ or something like that, and block that in the robots.txt file. This works less and less now that Google Preview Bot seems to be ignoring the disallow statement in those situations.
You could make it all the same URL with a unique identifier of some sort that tells your database where to redirect the click. For example:
www.yoursite.com/outlink/mylink#123
or
www.yoursite.com/mylink?link-id=123
In which case you could then block /mylink in the robots.txt file and tell Google to ignore the link-id parameter via Webmaster Tools.
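A minimal sketch of the lookup behind such a URL might look like the following. The `AFFILIATE_TARGETS` table stands in for the database, and the `resolve_outlink` function and the retailer URL are hypothetical. Only the query-string form is handled here, because browsers do not send the part after `#` to the server.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical lookup table standing in for the database.
# The target URL is a placeholder, not a real affiliate link.
AFFILIATE_TARGETS = {
    "123": "https://retailer.example/product?aff=yoursite",
}


def resolve_outlink(request_url: str) -> Optional[str]:
    """Return the redirect target for the link-id in request_url, or None."""
    query = parse_qs(urlparse(request_url).query)
    link_ids = query.get("link-id")
    if not link_ids:
        # No link-id parameter was supplied.
        return None
    # Unknown IDs also return None so the caller can respond with a 404.
    return AFFILIATE_TARGETS.get(link_ids[0])


if __name__ == "__main__":
    print(resolve_outlink("http://www.yoursite.com/mylink?link-id=123"))
```

In a real site the returned URL would be sent back as an HTTP redirect by whatever web framework is in use.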
As you can see, there is more than one way to skin this cat. The problem is always going to be doing it without looking like you're trying to "fool" Google - because they WILL catch up with any tactic like that eventually.
Good luck!
Everett
-
From a coding perspective, applying the nofollow to the links is the best way to go.
Only the top-tier search engines respect the robots.txt file. Lesser-known bots or spammers might read it to see what you don't want listed, and that gives them a starting point to look deeper into your site.
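For the nofollow approach, one way to keep the attribute consistent is to generate every affiliate link through a single helper. This is only a sketch: the `affiliate_anchor` function and the example paths are made up for illustration.

```python
from html import escape


def affiliate_anchor(href: str, text: str) -> str:
    """Build an anchor tag for an affiliate link with rel="nofollow" applied."""
    safe_href = escape(href, quote=True)  # escape &, <, > and quotes in the URL
    safe_text = escape(text)              # escape the visible link text
    return f'<a href="{safe_href}" rel="nofollow">{safe_text}</a>'


if __name__ == "__main__":
    print(affiliate_anchor("/out/retailer-123", "Buy now"))
```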