Robots.txt help
-
Hi Moz Community,
Google is indexing some developer pages from a previous website where I currently work:
ddcblog.dev.examplewebsite.com/categories/sub-categories
I was wondering how to include these in a robots.txt file so they no longer appear on Google. Can I do it under our homepage GWT account, or do I need a separate account set up for these URL types?
As always, your expertise is greatly appreciated,
-Reed
-
With the robots.txt block in place, the OP can go back into GWT and request removal of the dev site from the index. Password protecting a dev site is usually a pretty good idea, too.
-
Can you not just add an .htaccess password to the directory, to keep the dev site up but keep bots out?
-
You'll want a separate GWT account for that subdomain, and the robots.txt that excludes the subdomain has to be placed on that subdomain itself.
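For anyone who wants to sanity-check the file before requesting removal, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents are an assumption (a block-everything file served from the root of the dev subdomain), not something the OP posted, and `is_blocked` is just a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of the robots.txt served at the root of the dev subdomain
# (hypothetical: blocks every crawler from every path on that host).
DEV_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    dev_url = "http://ddcblog.dev.examplewebsite.com/categories/sub-categories"
    print(is_blocked(DEV_ROBOTS_TXT, dev_url))
```

Keep in mind that crawlers read robots.txt per host, so this file only has an effect if it is served from the dev subdomain, not from the main site.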
Related Questions
-
Robots.txt & Disallow: /*? Question!
Hi, I have a site where they have: Disallow: /*? The problem is that we need the following indexed: ?utm_source=google_shopping What would the best solution be? I have read:
User-agent: *
Allow: ?utm_source=google_shopping
Disallow: /*?
Any ideas?
Intermediate & Advanced SEO | | vetofunk
-
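One way to reason about this: Google documents that when several rules match, the most specific one (the longest path pattern) wins, and Allow wins a tie. The sketch below is a rough re-implementation of that matching for illustration only, not Google's actual parser, and other crawlers may behave differently. It assumes the Allow pattern is written as `/*?utm_source=google_shopping`, because path rules are matched from the start of the URL path. `pattern_to_regex` and `is_allowed` are hypothetical helper names.

```python
import re
from typing import List, Tuple


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, trailing '$' anchor) to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules: List[Tuple[str, str]], path: str) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# Assumed rule set, adapted from the question (the leading '/*' on Allow is an assumption).
RULES = [
    ("Allow", "/*?utm_source=google_shopping"),
    ("Disallow", "/*?"),
]

if __name__ == "__main__":
    for path in ("/product?utm_source=google_shopping", "/product?color=red", "/product"):
        print(path, is_allowed(RULES, path))
```

Note that this Allow pattern only matches when utm_source is the first parameter after the `?`; a URL such as `/product?color=red&utm_source=google_shopping` would still fall under the Disallow rule.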
Need Help - Lost 75% Of Traffic Since May 2018
Sorry to go in-depth here, but I want to give all available information. We went live in late April 2018 with our two websites in Shopify (moved from Magento, same admin, different store views... which we later found caused some issues). Both of these websites sell close to the same products (we purchased a competitor about 5 years ago, which is why we have two). The nice thing is that they do almost identical amounts in sales. They have done very well for years, especially in the last two years. Then the core algo update around May 22nd-24th 2018 happened and wiped out about 65% of our Google traffic for one website (MySupplementStore.com). And this latest update wiped out another 20%. I couldn't figure out why this would have happened, because we were very cautious about keeping things separate, unique descriptions etc. So I did some digging and this is what I found:
The reviews we migrated over from Magento somehow were combined and added to both websites. This is something I didn't notice. I had this resolved a month ago so that each site's reviews are now only on that website.
Our blog section was duplicated across both websites during the migration. Again, something I didn't notice, as we have around 1,000 or more blog posts per site. This was resolved two weeks ago.
As I was looking more, I found that for the last 6 months, a person working for us (for 3 years) had been writing descriptions and pasting them on both websites, instead of making them unique to each website. I trusted her for years, but I think she just got lazy. She quit about a month before the migration as well. We are currently working on this, but it's been taking a while because we have over 5,000 products on each site and have no idea which ones are duplicates.
I did also notice:
The site is very slow when checked with site speed tools. Working on that this week.
When I take snippets of text and search for them, many times they show up in omitted results.
No messages in Google Webmaster Tools.
So the question is... do you think it is the duplicate content issues that caused the drop? Our other site is Best Price Nutrition, which didn't see a big drop at all during that update. If not, any other ideas why?
Intermediate & Advanced SEO | | vetofunk
-
Branding and Page Titles - Please Help
Hello, I have a question about page titles. How important is branding here? I'm not referring to the company name, but rather the terminology that's used as "branding language" for a company. For example, let's say that it would be a good idea to target the keyword "Restaurant Coupons" based on search volume and competition. However, our branding adheres to the language "Dining Offers". Is it considered a bad idea to use "Restaurant Coupons" in the page title? Or is that considered inconsistent branding? Basically, I'm just trying to figure out the correct balance between the SEO value of words and adhering to a company's branding. Any help is appreciated! Thanks, Nick
Intermediate & Advanced SEO | | atmosol
-
Block subdomain directory in robots.txt
Instead of blocking an entire subdomain (fr.sitegeek.com) with robots.txt, we would like to block one directory (fr.sitegeek.com/blog). 'fr.sitegeek.com/blog' and 'www.sitegeek.com/blog' contain the same articles in one language; only the labels are changed for the 'fr' version, and we suspect that the duplicate content causes problems for SEO. We would like the 'www.sitegeek.com/blog' articles to be crawled and indexed, not the 'fr.sitegeek.com/blog' ones. So, please suggest how to block a single subdomain directory (fr.sitegeek.com/blog) with robots.txt. This applies only to the blog directory of the 'fr' version; all other directories and pages of the 'fr' version should still be crawled and indexed. Thanks, Rajiv
Intermediate & Advanced SEO | | gamesecure
-
Robots.txt
Hi all, Happy New Year! I want to block certain pages on our site as they are being flagged (according to my Moz Crawl Report) as duplicate content when in fact that isn't strictly true; it is more to do with the problems faced when using a CMS... Here are some examples of the pages I want to block, and underneath each is what I believe to be the correct robots.txt entry:
http://www.XYZ.com/forum/index.php?app=core&module=search&do=viewNewContent&search_app=members&search_app_filters[forums][searchInKey]=&period=today&userMode=&followedItemsOnly=
Disallow: /forum/index.php?app=core&module=search
http://www.XYZ.com/forum/index.php?app=core&module=reports&rcom=gallery&imageId=980&ctyp=image
Disallow: /forum/index.php?app=core&module=reports
http://www.XYZ.com/forum/index.php?app=forums&module=post&section=post&do=reply_post&f=146&t=741&qpid=13308
Disallow: /forum/index.php?app=forums&module=post
http://www.XYZ.com/forum/gallery/sizes/182-promenade/small/
http://www.XYZ.com/forum/gallery/sizes/182-promenade/large/
Disallow: /forum/gallery/sizes/
Any help / advice would be much appreciated. Many thanks, Andy
Intermediate & Advanced SEO | | TomKing0 -
Using folder blocked by robots.txt before uploaded to indexed folder - is that OK?
I have a folder "testing" within my domain that is blocked in robots.txt. My web developers use that "testing" folder when we are creating new content, before uploading it to an indexed folder. So the content is uploaded to the "testing" folder first (which is blocked by robots.txt) and later uploaded to an indexed folder, while a copy stays permanently in the "testing" folder. In fact, my entire website's content is located within "testing", with the same URL structure as the indexed pages except that it starts with the "testing/" folder. Question: even though the "testing" folder will not be indexed by search engines, is there a chance that search engines notice the content was first uploaded to the "testing" folder, so that the indexed folder is not guaranteed to get credit for the content, despite the "testing" folder being blocked by robots.txt? Would it be better to password protect this "testing" folder? Thx
Intermediate & Advanced SEO | | khi50 -
Urgent Help - Ecommerce URL best practice for SEO
Guys, I need some urgent help here as we need to get this sorted out soon. We have a page similar to Wayfair's Shop the Look: www.wayfair.com/Shop-The-Look/ What are the best practices for URL structure if we apply 2-3 filters? Is the Wayfair style good for SEO? FYI: We created our crawlable, link-friendly AJAX website using pushState() but are unsure of the structure for this case. We followed the advice in http://moz.com/blog/create-crawlable-link-friendly-ajax-websites-using-pushstate
Intermediate & Advanced SEO | | WayneRooney0 -
Blocking Dynamic URLs with Robots.txt
Background: My e-commerce site uses a lot of layered navigation and sorting links. While this is great for users, it results in a lot of URL variations of the same page being crawled by Google. For example, a standard category page: www.mysite.com/widgets.html ...which uses a "Price" layered navigation sidebar to filter products based on price, also produces the following URLs, which link to the same page:
http://www.mysite.com/widgets.html?price=1%2C250
http://www.mysite.com/widgets.html?price=2%2C250
http://www.mysite.com/widgets.html?price=3%2C250
As there are literally thousands of these URL variations being indexed, I'd like to use robots.txt to disallow them. Question: Is this a wise thing to do? Or does Google take layered navigation links into account by default, so that I don't need to worry? To implement it, I was going to add the following to robots.txt:
User-agent: *
Disallow: /*?
Disallow: /*=
...which would prevent any dynamic URL with a '?' or '=' from being indexed. Is there a better way to do this, or is this a good solution? Thank you!
Intermediate & Advanced SEO | | AndrewY1