How can I get unimportant pages out of Google?
-
Hi Guys,
I have a (newbie) question, untill recently I didn't had my robot.txt written properly so Google indexed around 1900 pages of my site, but only 380 pages are real pages, the rest are all /tag/ or /comment/ pages from my blog. I now have setup the sitemap and the robot.txt properly but how can I get the other pages out of Google? Is there a trick or will it just take a little time for Google to take out the pages?
Thanks!
Ramon
-
If you want to remove an entire directory, you can exclude that directory in robots.txt, then go to Google Webmaster Tools and request a URL removal. You'll have an option to remove an entire directory there.
-
No, sorry. What I said is, if you mark the folder as disalow in robots.txt, it will not remove the pages are already indexed.
But the meta tag, when the spiders go again on the page and see that the pages are with the noindex tag will remove it.
Since you can not already include the directory on the robots.txt. Before removing the SE pages.
First you put the noindex tag on all pages you want to remove. After they are removed, it takes a week for a month. After you add the folders in robots.txt to your site who do not want to index.
After that, you dont need to worry about the tags.
I say this because when you add in the robots.txt first, the SE does not read the page anymore, so they would not read the meta noindex tag. Therefore you must first remove the pages with noindex tag and then add in robot.txt
Hope this has helped.
João Vargas
-
No, sorry. What I said is, if you mark the folder as disalow in robots.txt, it will not remove the pages are already indexed.
But the meta tag, when the spiders go again on the page and see that the pages are with the noindex tag will remove it.
Since you can not already include the directory on the robots.txt. Before removing the SE pages.
First you put the noindex tag on all pages you want to remove. After they are removed, it takes a week for a month. After you add the folders in robots.txt to your site who do not want to index.
After that, you dont need to worry about the tags.
I say this because when you add in the robots.txt first, the SE does not read the page anymore, so they would not read the meta noindex tag. Therefore you must first remove the pages with noindex tag and then add in robot.txt
Hope this has helped.
João Vargas
-
Thanks Vargas, If I choose for noindex, I should remove it from the robot.txt right?
I understood that if you have a noindex tag on the page and as well a dissallow in the robot.txt the SE will index it, is that true?
-
For you remove the pages you want, need to put a tag:
<meta< span="">name="robots" content="noindex">If you want internal links and external relevance to pass on these pages, you put:
<meta< span="">name="robots" content="noindex, follow">If you do the lock on robot.txt: only need to include the tag in the current urls, new search engines will index no.
In my opinion, I do not like using the google url remover. Because if someday you want to index these folders, will not, at least it has happened to me.
The noindex tag works very well to remove objectionable content, within 1 month or so now will be removed.</meta<></meta<>
-
Yes. It's only a secondary level aid, and not guaranteed, yet it could help speed up the process of devaluing those pages in Google's internal system. If the system sees those, and cross-references to the robots.txt file it could help.
-
Thanks guys for your answers....
Alan, do you mean that I place the tag below at all the pages that I want out of Google? -
I agree with Alan's reply. Try canonical 1st. If you don't see any change, remove the URLs in GWT.
-
There's no bulk page request form so you'd need to submit every URL one at a time, and even then it's not a guaranteed way. You could consider gettting a canonical tag on those specific pages that provides a different URL from your blog, such as an appropriate category page, or the blog home page. That could help speed things up, but canonical tags themselves are only "hints" to Google.
Ultimately it's a time and patience thing.
-
It will take time, but you can help it along by using the url removal tool in Google Webmaster Tools. https://www.google.com/webmasters/tools/removals
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can I still monitor noindex, nofollow pages with Google Analytics?
I have a private/login site where all pages are noindex, nofollow. Can I still monitor external site links with Google Analytics?
Technical SEO | | jasmine.silver0 -
Why does Google's search results display my home page instead of my target page?
Why does Google's search results display my home page instead of my target page?
Technical SEO | | h.hedayati6712365410 -
Blog page won't get indexed
Hi Guys, I'm currently asked to work on a website. I noticed that the blog posts won't get indexed in Google. www.domain.com/blog does get indexed but the blogposts itself won't. They have been online for over 2 months now. I found this in the robots.txt file: Allow: / Disallow: /kitchenhandle/ Disallow: /blog/comments/ Disallow: /blog/author/ Disallow: /blog/homepage/feed/ I'm guessing that the last line causes this issue. Does anyone have an idea if this is the case and why they would include this in the robots.txt? Cheers!
Technical SEO | | Happy-SEO2 -
My sites "pages indexed by Google" have gone up more than qten-fold.
Prior to doing a little work cleaning up broken links and keyword stuffing Google only indexed 23/333 pages. I realize it may not be because of the work but now we have around 300/333. My question is is this a big deal? cheers,
Technical SEO | | Billboard20120 -
I have custom 404 page and getting so much 404 error on Google webmaster, what should i do?
I have a custom 404 page with popular post and category links in the page, everyday i have 404 crawl error on webmaster tools, what should i do?
Technical SEO | | rimon56930 -
If a page isn't linked to or directly sumitted to a search engine can it get indexed?
Hey Guys, I'm curious if there are ways a page can get indexed even if the page isn't linked to or hasn't been submitted to a search engine. To my knowledge the following page on our website is not linked to and we definitely didn't submit it to Google - but it's currently indexed: <cite>takelessons.com/admin.php/adminJobPosition/corp</cite> Anyone have any ideas as to why or how this could have happened? Hopefully I'm missing something obvious 🙂 Thanks, Jon
Technical SEO | | TakeLessons0 -
Getting a citation page indexed
Howdy mozzers, I have a citation on a .govt domain with 2 links pointing to my site. The page is not indexed by Google, bing or yahoo. URL; http://www.familyservices.govt.nz/directory/viewprovider.htm?id=17077 I have tried getting the paged indexed by building bookmark links to it. I have tweeted the url and gotten a few re-tweets for it. But no luck. The page has got no nofollow meta tag. Other listings have been indexed by google. Could someone please advise on means to help me get the page indexed? A strategy that I have not yet tried is submitting a sitemap that includes the external url as I am not sure if it is possible to include url's not part of my domain. Any advice, help would be greatly appreciated. viva le SEOmoz Thanks
Technical SEO | | ihms1 -
Why does our page show a description in english in google spain?
Hi! We have a multilingual page and I have set in Google Webmaster Tools the language preference for the root domain to be none, Spanish for the .com/es, English for the .com/en, and German for the .com/de. The title and description show in the right language in Google Germany and google UK, but in google.es (Spain) the title and description appear in English instead of Spanish. Does anybody know why could this be happening and how to fix it? kJtF3.png
Technical SEO | | inmonova0