Using a folder blocked by robots.txt before uploading to an indexed folder - is that OK?
-
I have a folder "testing" on my domain that is disallowed in robots.txt. My web developers use the "testing" folder when we create new content, before uploading it to an indexed folder. So the content is first uploaded to the "testing" folder (which is blocked by robots.txt) and later uploaded to an indexed folder, while a copy stays permanently in the "testing" folder. In fact, my entire website's content is located within "testing", with the same URL structure as the indexed pages, except that each URL starts with the "testing/" folder.
Question: even though the "testing" folder will not be indexed by search engines, is there a chance that search engines notice the content is first uploaded to the "testing" folder, so that the indexed folder is not guaranteed to get credit for the content, despite the "testing" folder being blocked by robots.txt? Would it be better to password-protect the "testing" folder?
Thx
-
Good observation.
-
Yep, just to jump in on the above, if a competitor is paying attention to your robots.txt file, they might notice a sweet stash of content under the /testing folder that they can nab. I have actually seen something similar happen in the past in a competitive SEO niche, so something to bear in mind.
-
As long as robots.txt correctly disallows the /testing folder, you do not have anything to worry about.
Since it is a staging environment, I would still recommend securing it with a password to keep your non-production content safe.
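If you want to double-check that the rule covers what you expect, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `www.example.com` domain, and the helper name `is_crawl_allowed` are assumptions for illustration, not your actual setup. It only tells you how a rule-following crawler should read the file; it says nothing about visitors who ignore robots.txt, which is why the password is still worth having.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "/testing/" mirrors the folder from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /testing/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let user_agent crawl url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not the poster's site.
    for url in (
        "https://www.example.com/testing/page.html",
        "https://www.example.com/page.html",
    ):
        print(url, "->", "allowed" if is_crawl_allowed(ROBOTS_TXT, url) else "blocked")
```

Note the trailing slash in `Disallow: /testing/`: it blocks everything inside the folder but would not match a differently named path such as `/testing-notes.html`.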