Indexed, though blocked by robots.txt: Need to bother?
-
Hi,
We have intentionally blocked some of our website's files, which had been indexed for years. Now we receive the message "Indexed, though blocked by robots.txt" in GSC. As far as I know, we can ignore this. Is that right, or is any action required? We thought of blocking them with meta tags, but these are PDF files.
Thanks
-
Hi there!
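What Google is telling you is that these URLs are in its index even though robots.txt stops it from crawling them. Either you have URLs indexed that you probably don't want indexed, or, the other way around, important pages are blocked but have been indexed for other reasons (for example, because other pages link to them).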
If I might ask, why did you block those files through robots.txt?
The two most common answers are:
1- You wanted to remove those files from search results. If this is your case, you've solved only part of the problem. What you should have done is first allow robots to crawl those URLs, then apply noindex rules (keep in mind that noindex can be set in the HTTP header via X-Robots-Tag, since non-HTML files such as PDFs can't carry a meta robots tag), and only after enough time has passed for them to drop out of the index, block them in robots.txt.
2- You wanted to optimize how Googlebot spends its crawling time. If this is your case, then you've done it correctly and there is nothing to worry about.
Hope this helps.
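To make case 1 a bit more concrete, here is a minimal Python sketch of the order of operations: a crawler can only see a noindex header on a URL it is allowed to fetch. The function names (`is_blocked`, `has_noindex`, `next_step`), the example.com URL and the sample robots.txt are all made up for illustration. The header check is simplified as well: it only looks for a plain `noindex` directive and ignores user-agent-specific values. Note that Python's standard `urllib.robotparser` does plain prefix matching and is not guaranteed to interpret rules exactly as Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def has_noindex(headers: dict) -> bool:
    """Return True if an X-Robots-Tag response header lists the noindex directive."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            directives = [part.strip().lower() for part in value.split(",")]
            if "noindex" in directives:
                return True
    return False


def next_step(robots_txt: str, url: str, headers: dict) -> str:
    """Suggest the next action for a URL that should leave the index (hypothetical helper)."""
    if is_blocked(robots_txt, url):
        # While the URL is blocked, the crawler cannot read any noindex header.
        return "allow crawling first"
    if not has_noindex(headers):
        return "add X-Robots-Tag: noindex"
    return "wait for deindexing, then block"


if __name__ == "__main__":
    # Hypothetical robots.txt and PDF URL, used only to exercise the helpers.
    robots = "User-agent: *\nDisallow: /files/\n"
    print(next_step(robots, "https://example.com/files/guide.pdf", {}))
```

In practice you would feed `next_step` the live robots.txt and the real response headers of each PDF. The point is only that the robots.txt block goes back in place last, not first.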
Best of luck.
GR