How to Hide Directories in Search?
-
I noticed bad 404 error links in Google Webmaster Tools, and they point to directories that do not have an actual page but do hold files.
Ex: there are links pointing to our PDF folder, which holds all of our PDF documents. If I type in example.com/pdf/, it brings up an unformatted webpage that lists all of our PDF links.
How do I prevent this from happening? Right now I am blocking these directories in my robots.txt file, but if I type the URLs in, the listings still appear.
Or should I not worry about this?
-
Yes, a visit to example.com/dir should now return a 404 error (if you haven't done any redirecting/canonicalizing). This will increase your 404 count in Webmaster Tools, but it's far preferable to the alternative. If you're not redirecting, the robots.txt will eventually work and hopefully the links will just fall out of WMT.
-
My hosting company turned off directory browsing and now everything is how it should be. So, to my understanding, if the server sees a directory that does not have an index file, it should not be viewable and should return a forbidden error. This should not affect us from an SEO standpoint, should it? My hosting company said they disabled browsing for all directories on our site; everything still works, except that the file directories are now forbidden.
-
Basically it shouldn't really have an effect; those unformatted file listings are literally the web server automatically saying 'here are the files that are in this folder'. There are no meta tags, descriptions, on-page elements, etc.
If you have these pages and they're ranking well, you generally don't want them to be. The automatic file browsing pages don't have your name, your company, etc. in them, and they're generally pretty ugly. They also theoretically could be 'stealing' juice from your 'real' pages, if your internal structure isn't flowing relevance properly.
Basically what I'm saying is that if these pages are having some kind of SEO effect, you probably don't want them to be since they're so basic.
Also I can't overstate the security concerns that directory browsing might be introducing. If someone can directory browse to where your code lives (.php, .aspx.vb, whatever) they may be able to read it. Code sometimes has important things like logins, passwords, merchant account ids, etc. in it that you definitely don't want people reading.
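If you want to check which of your folders are still serving those auto-generated listings, here is a rough sketch in Python. It assumes the server titles its listing pages "Index of /path", which is what Apache's autoindex pages look like; the function name and sample HTML are made up for illustration, and other servers (IIS, for example) format their listings differently.

```python
import re

# Heuristic only: Apache's auto-generated listings are titled "Index of /path".
# Servers that format listings differently will not match, so treat a False
# result as "unknown", not as proof that browsing is disabled.
_LISTING_TITLE = re.compile(r"<title>\s*Index of\s+/", re.IGNORECASE)


def looks_like_directory_listing(html: str) -> bool:
    """Return True if the HTML resembles an auto-generated directory listing."""
    return bool(_LISTING_TITLE.search(html))


# Hypothetical page bodies for illustration.
sample_listing = (
    "<html><head><title>Index of /pdf</title></head>"
    "<body><h1>Index of /pdf</h1></body></html>"
)
sample_page = (
    "<html><head><title>Our PDF library</title></head>"
    "<body><p>Downloads</p></body></html>"
)

print(looks_like_directory_listing(sample_listing))
print(looks_like_directory_listing(sample_page))
```

You would feed it the HTML you get back from each directory URL (example.com/pdf/, example.com/includes/, and so on) and follow up on anything it flags.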
-
Agreed with Valerie that step 1 is to turn off those directory listing pages - that can be a security issue and you don't necessarily want people to see/access the whole list. Also, make doubly sure you don't have any internal links to that directory (Google crawled it somehow).
Generally, robots.txt should prevent crawling, but it's not foolproof, and it's pretty bad about removing pages once they're indexed. If you can block the page from browsing and return a 404 for the root page, that should be fine. The other option would be to have the page removed in Google Webmaster Tools. You could request removal for the entire folder, but I'm guessing that you may want the actual PDFs indexed.
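To see what a folder-level robots.txt rule actually covers, here is a small sketch using Python's standard-library robots.txt parser. The rule and URLs are assumptions based on the /pdf/ folder in the question, and the parser does plain prefix matching, so it will not necessarily match every crawler's behavior.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt matching the situation in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given robots.txt text allows `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for url in (
    "https://example.com/pdf/",
    "https://example.com/pdf/report.pdf",
    "https://example.com/about.html",
):
    print(url, is_crawlable(url))
```

The thing to notice is that a Disallow on the folder also covers every PDF inside it, which is a problem if you want those PDFs crawled.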
-
Will turning off directory browsing affect search for all directories?
-
I really don't want to 301 redirect them as they are just holding files. This is happening with my includes folder too, which holds our header, footer, navigation, etc. I can check with our hosting company to find out.
-
I'd create an index.html for the directory, and then redirect it somewhere. This way, you're capturing the inbound links and then rescuing some of the inbound juice.
Otherwise, you can also check out this post for more info on other solutions and modifying your htaccess file to prevent the directory view - http://perishablepress.com/better-default-directory-views-with-htaccess/
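For the htaccess route, the usual Apache directive for switching off the directory view is `Options -Indexes`. Below is a rough Python sketch that adds that line to a folder's .htaccess if it isn't already there. The helper name is made up, and it assumes you are on Apache and that your host lets .htaccess files override `Options`; if not, the line will either be ignored or cause a server error, so test on one folder first.

```python
import os
import tempfile

DIRECTIVE = "Options -Indexes"  # Apache directive that disables auto listings


def disable_directory_listing(directory: str) -> bool:
    """Ensure directory/.htaccess contains the directive.

    Returns True if the file was created or changed, False if the
    directive was already present as its own line.
    """
    path = os.path.join(directory, ".htaccess")
    existing = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            existing = f.read()
    if any(line.strip() == DIRECTIVE for line in existing.splitlines()):
        return False
    with open(path, "a", encoding="utf-8") as f:
        # Keep the directive on its own line if the file lacks a trailing newline.
        if existing and not existing.endswith("\n"):
            f.write("\n")
        f.write(DIRECTIVE + "\n")
    return True


# Demo in a throwaway folder rather than a real web root.
with tempfile.TemporaryDirectory() as demo_dir:
    print(disable_directory_listing(demo_dir))  # first call writes the file
    print(disable_directory_listing(demo_dir))  # second call changes nothing
```

It appends rather than overwrites, so any existing rules in the file are left alone.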
-
Blocking it in robots.txt will work to hide it from search engines.
If you want to hide it from users who type in the URL, you can simply drop a blank "index.html" in the /pdf folder.
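If you have more than a couple of folders, the blank index.html trick can be scripted. This is only a sketch: the helper name and the list of index filenames are assumptions, so check which filenames your server actually treats as a directory index before relying on it.

```python
import os
import tempfile

# Assumed list of filenames the server treats as a directory index.
INDEX_NAMES = ("index.html", "index.htm", "index.php")


def add_blank_index(directory: str) -> bool:
    """Create an empty index.html in `directory` unless an index file exists.

    Returns True if a new blank file was created, False otherwise.
    """
    existing = set(os.listdir(directory))
    if any(name in existing for name in INDEX_NAMES):
        return False
    # Opening in write mode and closing immediately leaves a zero-byte file.
    with open(os.path.join(directory, "index.html"), "w", encoding="utf-8"):
        pass
    return True


# Demo in a throwaway folder standing in for /pdf.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "brochure.pdf"), "wb") as f:
        f.write(b"%PDF-1.4")
    print(add_blank_index(demo_dir))  # creates index.html
    print(add_blank_index(demo_dir))  # index.html already there
```

Visitors who type the folder URL then get an empty page instead of the file list, while direct links to the PDFs keep working.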
-
I would suggest 301'ing them to their /index.htm or /pdf.htm equivalents. If you don't know, a 301 is a signal to a web browser (or search crawler) saying "this page has permanently moved, please go to (otherpage.htm) instead".
Here's a good SEOMoz article explaining it a bit more:
http://www.seomoz.org/learn-seo/redirection
What might be more of a concern is that it sounds like your web server has directory browsing enabled. This could be a security issue (depending on your web server setup). Generally you don't want to expose directories if you don't have to, because it gives a potential attacker insight into your system setup. Here's an example of how to turn it off in Apache:
www.camelrichard.org/topics/Apache/Turn_OffDirectoryBrowsing
And IIS:
technet.microsoft.com/en-us/library/cc731109(v=ws.10).aspx
If you like I can confirm if you have open directories if you give me the link, either here or through private message.