Restricted by robots.txt: does this cause problems?
-
I have restricted around 1,500 links which, according to Webmaster Tools, are links to retailers' websites and affiliate links.
Is this the right approach? I thought it would affect the link juice. Or should I take the nofollow off the links that are restricted by the robots.txt file?
-
Hello Ocelot,
I am assuming you have a site that has affiliate links and you want to keep Google from crawling those affiliate links. If I am wrong, please let me know. Going forward with that assumption then...
That is one way to do it. So perhaps you first send all of those links through a redirect via a folder called /out/ or /links/ or whatever, and you have blocked that folder in the robots.txt file. Correct? If so, this is how many affiliate sites handle the situation.
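As a rough illustration of the setup described above, here is a minimal sketch that checks whether a URL falls under a robots.txt block, using Python's standard urllib.robotparser. The /out/ folder comes from the example above; the robots.txt contents, the sample URLs, and the is_crawlable helper are assumptions made up for this sketch, not taken from the asker's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; /out/ is the assumed redirect folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /out/
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules in ROBOTS_TXT allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A redirect URL inside the blocked folder (hypothetical).
    print(is_crawlable("http://www.yoursite.com/out/retailer-a"))
    # A normal content page outside the blocked folder (hypothetical).
    print(is_crawlable("http://www.yoursite.com/products/widget"))
```

This only tells you how a well-behaved crawler would read the rules; it says nothing about how any particular search engine treats the links themselves.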
I would not rely on rel nofollow alone, though I would use that in addition to the robots.txt block.
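To show the two measures side by side, here is a small sketch that builds an affiliate link pointing into the blocked /out/ folder and also carries rel="nofollow". The affiliate_anchor helper and the slug and link text are hypothetical, chosen only for illustration.

```python
from html import escape


def affiliate_anchor(slug: str, text: str) -> str:
    """Build an anchor that routes through /out/ and is marked nofollow."""
    href = f"/out/{slug}"  # /out/ is the folder assumed to be blocked in robots.txt
    return f'<a href="{escape(href, quote=True)}" rel="nofollow">{escape(text)}</a>'


if __name__ == "__main__":
    print(affiliate_anchor("retailer-a", "Buy at Retailer A"))
```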
There are many other ways to handle this. For instance, you could make all affiliate links JavaScript links instead of href links. Then you could put the JavaScript into a folder called /js/ or something like that, and block that in the robots.txt file. This works less and less now that Google Preview Bot seems to be ignoring the disallow statement in those situations.
You could make it all the same URL with a unique identifier of some sort that tells your database where to redirect the click. For example:
www.yoursite.com/outlink/mylink#123
or
www.yoursite.com/mylink?link-id=123
In which case you could then block /mylink in the robots.txt file and tell Google to ignore the link-id parameter via Webmaster Tools.
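The single-URL approach can be sketched as a lookup from the link-id parameter to a destination. This is only an illustration: the AFFILIATE_LINKS table stands in for the database, and the resolve_outlink helper and the retailer URL are invented for the example.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical lookup table standing in for the database of affiliate links.
AFFILIATE_LINKS = {
    "123": "https://retailer.example/product?aff=yoursite",
}


def resolve_outlink(request_url: str) -> Optional[str]:
    """Return the redirect target for a /mylink?link-id=N request, or None."""
    parsed = urlparse(request_url)
    if parsed.path != "/mylink":
        return None
    ids = parse_qs(parsed.query).get("link-id", [])
    if not ids:
        return None
    return AFFILIATE_LINKS.get(ids[0])


if __name__ == "__main__":
    print(resolve_outlink("http://www.yoursite.com/mylink?link-id=123"))
```

The sketch covers only the query-string form. A fragment such as #123 is not sent to the server with the HTTP request, so that variant would need client-side script to read the identifier.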
As you can see, there is more than one way to skin this cat. The problem is always going to be doing it without looking like you're trying to "fool" Google - because they WILL catch up with any tactic like that eventually.
Good luck!
Everett
-
From a coding perspective, applying the nofollow to the links is the best way to go.
Only the top-tier search engines respect the robots.txt file. Lesser-known bots or spammers might read it to see what you don't want listed, and that gives them a starting point for looking deeper into your site.