Meta robots on every page rather than robots.txt for blocking crawlers? How will pages get indexed if we block crawlers?
-
Hi all,
The suggestion to use the meta robots tag rather than the robots.txt file is meant to make sure pages do not get indexed if links to them are available anywhere on the internet. I don't understand how the pages would be indexed if the entire site is blocked. Even if links to the pages are available, will Google really index those pages? One of our sites has been blocked via the robots.txt file, and its internal links have been available on the internet for years, yet those pages have not been indexed. So technically the robots.txt file is enough, right? Please clarify and correct me if I'm wrong.
Thanks
-
I agree with Gaston's approach right up to step 4. If you put the no-indexed pages back behind a block in the robots.txt file, you'll end up back where you started: Google will still discover the no-indexed URLs elsewhere, the robots.txt block will stop it from crawling the pages and seeing the noindex, and the URLs will likely start to get added to the index again.
No-indexed URLs must not be blocked in robots.txt. Those two processes are mutually exclusive.
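To make that concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It only models the rule described above: a crawler that respects robots.txt never downloads a blocked page, so it can never read a noindex on it. The function name `crawler_sees_noindex`, the `example.com` URL, and the `Googlebot` user agent string are illustrative assumptions, and the sketch says nothing about how Google actually decides what to index.

```python
from urllib.robotparser import RobotFileParser


def crawler_sees_noindex(robots_txt: str, url: str, page_has_noindex: bool,
                         user_agent: str = "Googlebot") -> bool:
    """Return True only if the page may be fetched AND it carries a noindex."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        # Blocked by robots.txt: the page is never downloaded,
        # so its meta robots tag is never read.
        return False
    return page_has_noindex


# Hypothetical robots.txt contents and URL, for illustration only.
BLOCK_ALL = "User-agent: *\nDisallow: /"
ALLOW_ALL = "User-agent: *\nDisallow:"
URL = "https://example.com/page.html"

print(crawler_sees_noindex(BLOCK_ALL, URL, page_has_noindex=True))  # False
print(crawler_sees_noindex(ALLOW_ALL, URL, page_has_noindex=True))  # True
```

With the site blocked, the noindex is invisible to the crawler even though it is on the page; it only takes effect while crawling is allowed.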
-
Hi there,
TL;DR: The solution for deindexing and never being indexed again:
- Allow (with robots.txt) the site to be crawlable
- Apply the meta robots tag: noindex,follow
- Wait some weeks for the pages to be completely deindexed
- Block the entire site/section with robots.txt
Robots.txt and the robots meta tag can produce similar effects, but to understand them they must be analyzed separately.
-
Robots.txt: here you tell bots where they can go BEFORE they crawl any of the website. This is just a signal, not a directive, because robots can choose to ignore what's in the file. You can block anything from the entire site, to an entire section, to just specific pages. More info: Robots.txt official page and a really cool and complete guide to robots.txt
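As a rough illustration of those three scopes (whole site, one section, one page), the sketch below checks a few URLs against small robots.txt samples with Python's standard `urllib.robotparser`. The file contents, the paths `/private/` and `/old-page.html`, and the helper name `allowed` are made up for the example. Python's parser uses simple prefix matching, so real crawlers may treat some patterns differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if a crawler that obeys this robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt samples, one per blocking scope.
WHOLE_SITE = "User-agent: *\nDisallow: /"
ONE_SECTION = "User-agent: *\nDisallow: /private/"
ONE_PAGE = "User-agent: *\nDisallow: /old-page.html"

print(allowed(WHOLE_SITE, "https://example.com/blog/post.html"))   # False
print(allowed(ONE_SECTION, "https://example.com/private/a.html"))  # False
print(allowed(ONE_SECTION, "https://example.com/blog/post.html"))  # True
print(allowed(ONE_PAGE, "https://example.com/old-page.html"))      # False
```

Note that this only answers "may a polite bot fetch this URL?". It says nothing about whether the URL ends up in the index.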
-
Robots meta tag: with it you have more signals to send. The most used are noindex, nofollow and follow, because most problems are about indexing. More info: Robots.txt official page, Google developers, Meta Robots directive - Moz and a complete guide to meta robots tag - YOAST.
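For reference, the tag lives in the page's head, and a crawler has to download the page to read it. The sketch below pulls the directives out of a page with Python's standard `html.parser`. The class name `RobotsMetaParser`, the helper `robots_directives`, and the sample HTML are assumptions for illustration, not how any particular search engine parses pages.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex,follow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page carrying the tag from the steps above.
PAGE = '<html><head><meta name="robots" content="noindex,follow"></head><body></body></html>'
print(sorted(robots_directives(PAGE)))  # ['follow', 'noindex']
```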
Hope this is what you wanted.
Best of luck
GR.