How can I exclude display ads from robots.txt?
-
Google has stated that you can do this so that spiders get to your content only, and get there faster. Our IT guy is saying it's impossible.
Do you know how to exclude display ads from robots.txt? Any help would be much appreciated.
-
You'd want to disallow crawling of the URL paths where the display ads live in your robots.txt, just like any other section of your site. Here are some basics on robots.txt.
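For illustration, a minimal robots.txt sketch, assuming the ads are served out of dedicated paths (the /ads/ and /banners/ directories below are hypothetical; substitute whatever paths your ad server actually uses):

    User-agent: *
    # Hypothetical directories that serve display ad content
    Disallow: /ads/
    Disallow: /banners/

Note this only stops compliant crawlers from fetching those paths; it doesn't change how the ads render for visitors.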
Hope this helps!
Related Questions
-
Can you confirm legitimate Googlebot traffic?
We use Cloudflare as a firewall. I noticed a significant number of blocks of bot traffic. One of the things they do is try to block bad bot traffic, but it seems they are mistakenly blocking Googlebot traffic. If you use Cloudflare, you may want to look into this as well. Also, can you confirm whether the following IPs are for legitimate Googlebots? 66.249.79.88, 66.249.79.65, 66.249.79.80, 66.249.79.76. Thanks!
Technical SEO | akin67
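Google's documented method for verifying Googlebot is a reverse DNS lookup on the IP, followed by a forward lookup to confirm the round trip. A minimal Python sketch of that check, using only the standard library (the IP is one of those listed above):

    import socket

    def is_googlebot(ip):
        """Check a claimed Googlebot IP via reverse then forward DNS."""
        try:
            # Reverse DNS: genuine Googlebot PTR records end in googlebot.com or google.com
            host = socket.gethostbyaddr(ip)[0]
        except socket.herror:
            return False
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        try:
            # Forward DNS: the hostname must resolve back to the original IP
            return ip in socket.gethostbyname_ex(host)[2]
        except socket.gaierror:
            return False

    print(is_googlebot("66.249.79.88"))

For what it's worth, those IPs all fall in 66.249.64.0/19, a range Googlebot is well known to crawl from, so they are very likely genuine; the DNS check above confirms it per IP.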
Blocking pages in robots.txt that are under a redirected subdomain
Hi Everyone, I have a lot of Marketo landing pages that I don't want to show in the SERPs. Adding the noindex meta tag to each page would be too much; I have thousands of pages. Blocking them in robots.txt could have been an option, BUT the subdomain homepage is redirected to my main domain (with a 302), so I may confuse search engines (should they follow the redirect, or should they respect the block?). marketo.mydomain.com is redirected to www.mydomain.com. disallow: / (I think this will be confusing with the redirect.) I don't have folders; all pages sit directly under the subdomain, so I can't block folders in robots.txt either. Has anyone had this scenario, or any suggestions? I appreciate your thoughts here. Thank you, Rachel
Technical SEO | RaquelSaiz
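One point that may help with the scenario above: robots.txt is read per hostname, so a file at marketo.mydomain.com/robots.txt governs only that subdomain and has no effect on www.mydomain.com. The homepage 302 doesn't change this, because crawlers fetch each host's robots.txt directly rather than via the redirected homepage. A sketch of the subdomain's own file, blanket-blocking it as the question proposes:

    User-agent: *
    # Applies only to marketo.mydomain.com, not to www.mydomain.com
    Disallow: /

One caveat: pages blocked this way can still appear as URL-only listings if they're linked from elsewhere; a noindex directive is the surer way to keep them out of the SERPs.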
Can Google Crawl This Page?
I'm going to have to post the page in question, which I'd rather not do, but I have permission from the client to do so. Question: a recruitment client of mine had their website built on a proprietary platform by a so-called recruitment specialist agency. Unfortunately the site is not performing well in the organic listings. I believe the culprit is this page and others like it: http://www.prospect-health.com/Jobs/?st=0&o3=973&s=1&o4=1215&sortdir=desc&displayinstance=Advanced Search_Site1&pagesize=50000&page=1&o1=255&sortby=CreationDate&o2=260&ij=0 Basically, as soon as you deviate from the top-level pages you land on pages that have database-query URLs like this one. My take on it is that Google cannot crawl these pages and is therefore having trouble picking up all of the job listings. I have taken some measures to combat this, and obviously we have an XML sitemap in place, but it seems the pages that Google finds via the XML feed are not performing because there is no obvious flow of 'link juice' to them. There are a number of latest jobs listed on top-level pages like this one: http://www.prospect-health.com/optometry-jobs and when they are picked up they perform OK in the SERPs, which is the biggest clue to the problem outlined above. The agency in question have an SEO department who dispute the problem, and their proposed solution is to create more content and build more links (genius!). Just looking for some clarification from you guys if you don't mind?
Technical SEO | shr109
Can hreflang replace canonicalisation?
Hi, I'm working with a site that has a LOT of duplicate content and have recommended the developer fix it via correct use of canonicalisation, i.e. the canonical tag. However, a US version (of this UK site) is about to be developed on a subfolder (domain.com/uk/ & domain.com/us/ etc.), so I'm also looking into adopting the hreflang attribute on these. Upon reading up about the hreflang attribute, I see that it performs a degree of canonicalisation too. Does that mean that developing the international versions with hreflang means there's no need to apply canonicalisation tags to deal with the dupe content, since hreflang will deal with the original dupe content problems as well as the new country-related dupe content? I also understand that hreflang and canonicalisation can conflict/clash on different language versions of international subfolders etc., as per: http://www.youtube.com/watch?v=Igbrm1z_7Hk In this instance we are only looking at US/UK versions, but we will very likely want to expand into non-English countries too in the future, France for example. So given both the above points, if you are using hreflang, is it advisable (or even best) to totally avoid the canonical tag? I would be surprised if the answer's yes, since whilst it makes logical sense given the above (if the above statements are correct), it seems strange given how important and standard best-practice canonical usage is these days. What's best? Use hreflang alone, the canonical tag alone, or both? What does everyone else do in a similar situation? All Best, Dan
Technical SEO | Dan-Lawrence
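For reference, the pattern generally recommended for this scenario is both together: hreflang annotations between the country versions, with each version carrying a self-referencing canonical rather than canonicalising one country to the other. A sketch of what the head of the hypothetical domain.com/us/ page might contain:

    <link rel="canonical" href="https://domain.com/us/" />
    <link rel="alternate" hreflang="en-us" href="https://domain.com/us/" />
    <link rel="alternate" hreflang="en-gb" href="https://domain.com/uk/" />

Canonicalising the US pages to the UK ones would tell Google to drop the US URLs from the index, which would defeat the hreflang annotations; that is the kind of clash the question refers to.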
How can I find my Webmaster Tools HTML file?
So, totally amateur hour here, but I can't for the life of me find our HTML verification file for Webmaster Tools. I see nowhere to view it in the Google Webmaster Tools console; I tried a site: search; I googled it; all the info out there is about how to verify a site. Ours is verified, but I need the verification file code to sync up with the Google API, and no one seems to have it. Any thoughts?
Technical SEO | healthgrades
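For anyone else hunting for it: the verification file is a tiny text file served from the site root, and its entire contents simply repeat its own filename. A hypothetical example (the token below is made up; the real one appears when you choose the HTML-file verification method in the console):

    URL:      https://www.example.com/google1a2b3c4d5e6f7a8b.html
    Contents: google-site-verification: google1a2b3c4d5e6f7a8b.html

If the site was verified by this method, fetching the file straight from the live site is often the quickest way to recover the code.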
Restricted by robots.txt: does this cause problems?
I have restricted around 1,500 links, which are links to retailers' websites and affiliate links, according to Webmaster Tools. Is this the right approach, as I thought it would affect the link juice? Or should I take the nofollow out of the links restricted by the robots.txt file?
Technical SEO | ocelot
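If those 1,500 links are routed through a common redirect path, a single pattern usually covers them all rather than listing each URL. A sketch assuming a hypothetical /go/ directory for the affiliate redirects (substitute whatever path the links actually use):

    User-agent: *
    # Hypothetical directory that all outbound affiliate redirects pass through
    Disallow: /go/

On the link-juice worry: a robots.txt block stops Google crawling those redirect URLs at all, so no equity flows through them to the retailers either way; the choice between nofollow and a crawl block is mostly about crawl budget rather than link juice.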
Un-Indexing a Page without robots.txt or access to HEAD
I am in a situation where a page was pushed live (went live for an hour and then was taken down) before it was supposed to go live. Now normally I would utilize robots.txt or the meta robots noindex tag, but I do not have access to either, and putting a request in will not suffice as it is against protocol with the CMS. So basically I am left with neither option, and I cannot seem to find a nice way to play with the SEs to get this un-indexed. I know for this instance I could go to GWT and do it, but for clients that do not have GWT, and for all the other SEs, how could I do this? Here is the big question: what if I have a promotional page that I don't want indexed and am met with these same limitations? Is there anything to do here?
Technical SEO | DRSearchEngOpt
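For context, the on-page option the question alludes to is the meta robots tag, which requires HEAD access:

    <meta name="robots" content="noindex" />

Where the HEAD is locked down but server configuration is reachable, the equivalent X-Robots-Tag response header does the same job without touching the page. A hypothetical Apache (mod_headers) sketch, with a made-up filename:

    <Files "promo-page.html">
        Header set X-Robots-Tag "noindex"
    </Files>

Failing both, the URL removal tools in each engine's webmaster console remain the fallback the question already identifies for GWT.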
Robots exclusion
Hi All, I have an issue whereby print versions of my articles are being flagged up as "duplicate" content / page titles. In order to get around this, I feel the easiest way is to just add them to my robots.txt document with a disallow. Here is my URL make-up: Normal article: www.mysite.com/displayarticle=12345 Print version of my article: www.mysite.com/displayarticle=12345&printversion=yes I know that having dynamic parameters in my URL is not best practice, to say the least, but I'm stuck with this for the time being... My question is, how do I add just the print versions of articles to my robots file without disallowing the articles too? Can I just add the parameter to the document like so? Disallow: &printversion=yes I also know that I can add a meta noindex, nofollow tag into the head of my print versions, but I feel a robots.txt disallow will be somewhat easier... Many thanks in advance. Matt
Technical SEO | Horizon
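On the wildcard question above: the major engines support * and $ pattern matching in robots.txt, so the parameter can be targeted directly without listing article IDs. A sketch matching the URL make-up described:

    User-agent: *
    # Matches any URL containing the print-version parameter
    Disallow: /*printversion=yes

A bare "Disallow: &printversion=yes" would not work, because disallow rules match from the start of the URL path; the leading /* is what lets the rule match the parameter wherever it appears.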