Robots.txt & meta noindex--site still shows up on Google Search
-
I have set up my robots.txt like this:
User-agent: *
Disallow: /and I have this meta tag in my on a Wordpress site, set up with SEO Yoast
name="robots" content="noindex,follow"/>
I did "Fetch as Google" on my Google Search Console
My website is still showing up in the search results and it says this:
"A description for this result is not available because of this site's robots.txt"
This site has not shown up for years and now it is ranking above my site that I want to rank for this keyword. How do I get Google to ignore this site? This seems really weird and I'm confused how a site with little content, that has not been updated for years can rank higher than a site that is constantly updated and improved.
-
CleverPhd,
Really since to see a detailed yet to the point answer.
Thanks for contributing, and being in the Moz community.
Regards,
Vijay
-
Thanks for that clarification CleverPhD, forgot to mention that.
-
This one has my vote. You have to allow them access in order to see that you don't want the pages indexed. If you block them from seeing this rule...well they won't be able to see it.
-
Just to be clear on what Logan said. You have to allow Google to crawl your site by opening up your robots.txt to Google so it can see your noindex directive that is on each of the pages. Otherwise Google will never "see" the noindex directive on your pages.
Likewise, on sitemap.xml. If you are not allowing Google to crawl the sitemap (because you are blocking it with robots.txt) then Google will not read the sitemap, find all your pages that have the noindex directive on them and then remove those pages from the index.
A great article is here
https://support.google.com/webmasters/answer/93710?hl=en&ref_topic=4598466
From the mouth of Google "Important! For the noindex meta tag to be effective, the page must not be blocked by a robots.txt file. If the page is blocked by a robots.txt file, the crawler will never see the noindex tag, and the page can still appear in search results, for example if other pages link to it."
The other point that logan makes is that Google might list your site if there are enough sites linking to it. The steps above should take care of this, as you are deindexing the page, but here is what I am thinking he is referencing
https://www.youtube.com/watch?v=KBdEwpRQRD0
Google will include a site that is blocked in robots.txt if enough pages link to it, even if they have not crawled the url.
You can go into Search Console and find all the links that they say are pointing to your site. You can also use tools like CognitiveSEO or Ahrefs, Majestic or Moz etc and gather up all of those sites to find links to your site and include those in a disavow file that you put into Search Console and tell Google to ignore all of those links to your site.
Secret bonus method. Putting a noindex directive in your robots
https://www.deepcrawl.com/knowledge/best-practice/robots-txt-noindex-the-best-kept-secret-in-seo/
This allows you to manage your noindex directives in your robots.txt. Makes it easier as you can control all your noindex directives from a central location and block whole folders at a time. This would stop Google from crawling AND indexing pages all in one page and you can just leave the rest of the site alone and not worry about if a noindex tag should or should not be on a certain page.
Good luck!
-
As mentioned by Logan,noindex meta tag
is the most effective way to remove indexed pages. It sometimes takes time, you have to submit the right sitemap.xml which cover the pages/post you wish to get removed from google index.
-
I did read that about the robots.txt and that is why I added the noindex.
I use SEO Yoast for sitemap.xml, so shouldn't all my pages be there? I believe they are because I just looked at it a couple days ago.
So are you saying I should look through my backlink profile (WMT) and try to remove any backlinks?
Would 'Fetch as Google' not ping Google to tell them to recrawl?
Thanks for your help.
-
Hi,
First things first, it's a common misconception that the robots.txt disallow: / will prevent indexing. It's only indented to prevent crawling, which is why you don't get a meta description pulled into the result snippet. If you have links pointing to that page and a disallow: / on your robots, it's still eligible for indexation.
Second, it's pretty weird that the noindex tag isn't effective, as that's the only sure-fire way to get de-indexed intentionally. I would recommend creating an XML sitemap for all URLs on that domain that are noindex'd and resubmit that in Search Console. If Google hasn't crawled your site since adding the noindex, they don't know it's there. In my experience, forcing them to recrawl via XML submission has been effective at getting noindex noticed quicker.
I would also recommend taking a look at the link profile and removing any possible links pointing to your noindex pages, this will help future attempts at indexing.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My site Has Penalized By google Search Result Without Any Spam Score.
I Recently Make a Site Gizmocombot.com. tHE aITE has NO spam Record NO lousy BACKLINK.it has all unique article can anyone tell us how we can unpenalized our site from google webmaster and google search Result. i attcead a screenshot as well yoou need. 3nzmALp
Technical SEO | | litoginamaaba3332 -
Why Google ranks a page with Meta Robots: NO INDEX, NO FOLLOW?
Hi guys, I was playing with the new OSE when I found out a weird thing: if you Google "performing arts school london" you will see w w w . mountview . org. uk at the 3rd position. The point is that page has "Meta Robots: NO INDEX, NO FOLLOW", why Google indexed it? Here you can see the robots.txt allows Google to index the URL but not the content, in article they also say the meta robots tag will properly avoid Google from indexing the URL either. Apparently, in my case that page is the only one has the tag "NO INDEX, NO FOLLOW", but it's the home page. so I said to myself: OK, perhaps they have just changed that tag therefore Google needs time to re-crawl that page and de-index following the no index tag. How long do you think it will take to don't see that page indexed? Do you think it will effect the whole website, as I suppose if you have that tag on your home page (the root domain) you will lose a lot of links' juice - it's totally unnatural a backlinks profile without links to a root domain? Cheers, Pierpaolo
Technical SEO | | madcow780 -
Meta Description Being Picked up from another site!?
Hi, when we search for a phrase (which is the most searched for phrase for our company) the meta description which is displayed isnt the one we set, and it hasnt picked it up from any text on the page. The description is incorrect, it says we have an office in a city that we dont, and it just isnt a very good description generally. What has been suggested to us by our website developers is that the description is being picked up by google from a website which lists companies details. The description which is displayed on that website, is the same as the description which is shown for our company in the search results. But is it possible for Google to ignore the meta description which is set in our homepage and the other text on the home page, and pickup the text from another website and use it as our description? Many Thanks
Technical SEO | | danieldunn100 -
Links under Meta Description when performing a search
Doing research for clients, I have came across seeing sites displaying hyperlinks underneath their own meta description. keywords that I have googled that result with hyperlinks displaying under meta descriptions: Google'd: iacquire (brand) bmw wheels (Beyern Wheels, position 1) aftermarket bmw wheels (MMR Wheels, position 2) These companys have hyperlinks underneath their descriptions. Anyone have any ideas why this happens or how it happens?
Technical SEO | | frnprz0 -
Google site: operator showing only 30 results for whatever website you may like, omitting the rest
site:wikipedia.org site:seomoz.org site:nytimes.com site:WHATEVER YOU PUT HERE 🙂 is currently always showing just 3 SERP pages and the well known ugly message: In order to show you the most relevant results, we have omitted some entries very similar to the 30 already displayed.
Technical SEO | | j.royal
If you like, you can repeat the search with the omitted results included. Any idea what's going on?!!? Ongoing update?0 -
Does Site Structure Affect Google
Hi - I'm pretty new at this. We’re running an e-commerce affiliate site at http://www.mydomain.com. So we don’t take payments but customer gets passed through to third party sites when they select to buy a product. We have a blog at http://www.mydomain.com/news. I think Google is treating these 2 sites as as separate sites for PR. For this reason we're thinking about moving this to http://news.mydomain.com. Anyone have any experience in this?
Technical SEO | | richardjoseph0 -
Google seems to be penalizing my site for some reason
I recently took control of a website which did have some pretty big SEO problems, duplicate content being one of the main ones!! Looking back at ranking data the website ranked very well for it's main keyword, #5 for Google, Yahoo and Bing. The ranking then dropped in February 2012 for Google to #64 but stayed the same for Yahoo and Bing. I scrapped the dodgy content and completely rewrote it using a Wordpress framework about 6 weeks ago, still targeting the same keywords and 301 redirecting the old pages to the new pages where applicable. My rankings for Yahoo and Bing are still maintaining their page 1 rankings but Google is still ranking the website on page 5/6. My question is. Is the website getting punished for something that was part of the old website? If so how can I find out what it is and fix it? This website ranked on page 1 for Google for most of it's popular keywords but now it doesn't. I appreciate any feedback Many Thanks : )
Technical SEO | | alexhowe0 -
Sitemap.xml showing up in Google Search
Hello when I do a Google search my sitemap.xml shows up for lots of queries. Does anyone have any advise on this? Should I remove url in Google Webmaster? Thanks,
Technical SEO | | Socialdude0