Removing CSS & JS Files from Index
-
Hi,
Google has indexed a few .CSS and .JS files that belong to our WordPress plugins and themes. I had them blocked via robots, but realized this doesn't prevent indexation (and can likely hurt us since Google wants to access these files).
I've since removed the robots.txt instructions and submitted a removal request via Search Console, but I want to make sure the files don't come back.
Is there a way to put a noindex tag within .CSS and .JS files? Or should I do something with .htaccess instead?
-
I figured .htaccess would be the best route. Thank you for researching and confirming. I appreciate it.
-
Hi Tim,
Assigning a noindex tag to these files will not block them, only prevent them from showing in SERPs. This is the intended goal and the reason I deleted my robots.txt file which prevented crawling.
-
There's quite a big difference between crawling directives, which block, and indexing directives, which don't. This article by (former?) Moz user S_ebastian_ is a good foundation read.
This article at developers.google.com is a good second read. If I'm understanding it right, Google thinks in terms of crawling directives vs indexing / serving directives.
My attempt at a TL;DR:
crawling = looking, using in any way :: controlled via robots.txt
indexing / serving = indexing, archiving, displaying snippets in results, etc :: controlled via html meta tags or web server htaccess (or similar for other web servers).
I'm not yet convinced that asking for noindex via htaccess causes the same sort of grief that a disallow in robots.txt causes.
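To make the indexing side of that split concrete, here is a minimal Python sketch that checks whether an X-Robots-Tag header value asks for noindex. The function name `asks_noindex` is made up for illustration, and the parsing is deliberately simplified: it ignores the optional user-agent prefix (such as `googlebot:`) that the header can carry.

```python
def asks_noindex(x_robots_tag: str) -> bool:
    """Return True if an X-Robots-Tag header value requests noindex.

    Simplified on purpose: directives are split on commas, and any
    user-agent prefix is not handled.
    """
    directives = {part.strip().lower() for part in x_robots_tag.split(",")}
    # "none" is shorthand for "noindex, nofollow".
    return bool(directives & {"noindex", "none"})


if __name__ == "__main__":
    for value in ["noindex", "noindex, nofollow", "nofollow", ""]:
        print(repr(value), asks_noindex(value))
```

A check like this only reads the indexing directive; whether the file may be crawled at all is a separate question that robots.txt answers.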
-
I would seriously think again when it comes to blocking/no-indexing your CSS and JS files - Google has in the past stated that if they cannot fully render your site properly then this could lead to poorer rankings.
You will also likely see this reported as errors in Search Console.
Check out this great article from July this year which goes into more details.
-
I haven't encountered undesirable .css or .js indexing myself (yet), but as you surmised, maybe this htaccess directive might be worth trying?
# Requires mod_headers to be enabled
<FilesMatch "\.(txt|log|xml|css|js)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>
Google seems to support it.
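As a rough way to see which files a pattern like this would cover, the sketch below applies the same regular expression in Python, with the dot escaped so it matches a literal "." before the extension. Python's `re` only approximates Apache's regex engine, and the paths and the helper name `gets_noindex_header` are hypothetical.

```python
import re

# Same pattern as the FilesMatch directive, with the dot escaped.
NOINDEX_PATTERN = re.compile(r"\.(txt|log|xml|css|js)$")


def gets_noindex_header(path: str) -> bool:
    """Return True if the file name at the end of path matches the pattern."""
    filename = path.rsplit("/", 1)[-1]  # FilesMatch looks at the file name only
    return NOINDEX_PATTERN.search(filename) is not None


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for path in [
        "/wp-content/themes/example/style.css",
        "/wp-content/plugins/example/app.min.js",
        "/index.html",
        "/robots.txt",
        "/sitemap.xml",
    ]:
        print(path, gets_noindex_header(path))
```

Note that the pattern also covers .txt and .xml files, so robots.txt and an XML sitemap would receive the header too; trimming the extension list to `css|js` would limit it to the files discussed here.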
-
Unless I'm severely misreading the links provided, which I've read before, it seems Google is stating that they read, render, and sometimes index .CSS and .JS files. Here's an article written a week after the second article you posted.
The aforementioned WordPress plugin and theme files hosted on my server are indeed showing up in Google SERPs.
I do not want to prevent Googlebot from reaching these files as they're needed for optimal site performance, but I do want them to be no-indexed. Thus, I don't want robots.txt to prevent crawling, only indexing.
Let me know if I'm misunderstanding.
-
TL;DR - You're worried about a problem that doesn't exist.
Googlebot doesn't index CSS or JS files. It indexes text files, HTML, PDF, DOC, XLS, etc., but it doesn't index style sheets or JavaScript files.
All you need in WordPress is to create a minimal robots.txt file where WP is installed, with this content:
User-agent: *
Disallow:
Sitemap: http://site/sitemap-file-name.xml
And that's all. This has been explained many times:
http://googlewebmastercentral.blogspot.bg/2014/05/understanding-web-pages-better.html
http://googlewebmastercentral.blogspot.bg/2014/10/updating-our-technical-webmaster.html
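For anyone who wants to confirm what that robots.txt permits, here is a small sketch using Python's standard `urllib.robotparser` (the `site_maps()` call assumes Python 3.8 or later). The stylesheet URL is a hypothetical example; the robots.txt content is the one quoted above.

```python
from urllib.robotparser import RobotFileParser

# The allow-all robots.txt quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: http://site/sitemap-file-name.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical theme stylesheet URL, for illustration only.
css_url = "http://site/wp-content/themes/example/style.css"

if __name__ == "__main__":
    # An empty Disallow line means nothing is blocked from crawling.
    print(parser.can_fetch("Googlebot", css_url))
    print(parser.site_maps())
```

This only shows that crawling is permitted. It says nothing about indexing, which robots.txt does not control.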