Best way to stop pages from being indexed while keeping PageRank
-
If, for example, on a discussion forum, what would be the best way to stop pages such as the posting page (where a user posts a topic or message) from being indexed AND avoid diluting PageRank? If we added them to the Disallow rules in robots.txt, would PageRank still flow through the links to those blocked pages, or would it stay concentrated on the linking page? Your ideas and suggestions will be greatly appreciated.
-
Hi Peter,
Pages blocked by robots.txt are treated as if they weren't there, so no PageRank flows through them. You might want to use "noindex, follow" on these pages instead: the pages are still crawled and the links on them are followed, so any received link juice flows from these pages to others. "Noindex" means these pages won't dilute PageRank (or rankings).
Furthermore, "noindex, follow" on a page-by-page basis keeps pages out of the index faster and more reliably than robots.txt (which is only re-fetched every 12 hours or so).
You might want to use "noindex, follow" on all non-important pages, such as legal pages.
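To illustrate the difference, here is a minimal sketch of the two approaches (the /post.php path is just a hypothetical example of a forum posting page):

```
# robots.txt approach – the page is never crawled, so it passes nothing:
User-agent: *
Disallow: /post.php

# meta robots approach – placed in the page's <head>; the page is crawled,
# its links are followed, but it is kept out of the index:
<meta name="robots" content="noindex, follow">
```

Note the two are mutually exclusive: if the page is disallowed in robots.txt, crawlers never fetch it and so never see the meta tag.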
Sebastian
Related Questions
-
Is it best practice to have canonical tags on all pages?
The website I'm working on has no canonical tags. There is duplicate content, so rel=canonical tags need adding to certain pages, but is it best practice to have the tag on every page?
Intermediate & Advanced SEO | ColesNathan
Same language, Different countries. What would be the best way to introduce it?
Hello, we have a .com Magento store geo-targeted to the US. We're going to launch different versions soon: one for the US and another for Canada (we're going to add Spanish and French versions later as well). The stores' content will be the same, except for currency and the contact-us page. What would be the better strategy to introduce this to Google, and which URL structure is better: example.com/ca/, example.com/en-ca/, or ca.example.com/? Should we stay with the original www.example.com/ and just close off access to /ca/ and /us/, use rel=canonical, or use hreflang "alternate" annotations to avoid duplicate content issues? Thanks in advance.
Intermediate & Advanced SEO | Meditinc.com
Best way to remove low quality paginated search pages
I have a website with around 90k pages indexed, but after doing the math I realized that only around 20-30k of them are actually high quality; the rest are paginated pages from search results within my website. Every time someone searches a term on my site, that term gets its own page, which includes all of the relevant posts associated with that search term/tag. My site had around 20k different search terms, all being indexed. I have paused new search terms from being indexed, but what I want to know is whether the best route would be to 404 all of the useless paginated pages from the search-term pages. And if so, how many should I remove at one time? There must be 40-50k paginated pages, and I am curious what would be the best bet from an SEO standpoint. All feedback is greatly appreciated. Thanks.
Intermediate & Advanced SEO | WebServiceConsulting.com
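One way the removal described above is often implemented is to serve an explicit 410 Gone (a slightly stronger "permanently removed" signal than 404) for the whole internal-search URL space rather than per page. A minimal sketch in nginx config, assuming the search-result pages live under a hypothetical /search/ path:

```
# Hypothetical nginx config: every internal-search URL returns 410 Gone,
# telling crawlers those pages are permanently removed.
location ^~ /search/ {
    return 410;
}
```

This is a sketch, not a recommendation on how many pages to drop at once; the actual URL pattern would need to match the site in question.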
Best possible linking on site with 100K indexed pages
Hello all, first of all I would like to thank everybody here for sharing such great knowledge with such amazing and heartfelt passion. It really is good to see. Thank you. My story / question: I recently sold a site with more than 100k pages indexed in Google. I was allowed to keep links on the site: actual anchor-text links on both the home page and the 100k news articles. On top of that, my site syndicates its RSS feed (just links and titles, no content) to this site. However, the new owner made a mess, and now the site could possibly be seen as linking badly to mine. Google tells me within Webmaster Tools that this particular site gives me more than 400k backlinks. I have NEVER received a single notice from Google that I have bad links. But I was worried that this site could have been the reason why MY site tanked as badly as it did; it's the only source linking to me so massively. Just a few days ago I got in contact with the new site owner, and he has accepted my offer to help him improve his site. Although getting the site up to date for him is my main purpose, while I am there I will also put effort into optimizing the links back to my site. My question: what would be the best approach for the most SEO gain? The site is a newspaper-type site catering for news within the exact niche my site is trying to rank in. The difference is that his is a news site and mine is commercial. Once I fix his site, there will be regular news updates, several times per day, all within the niche we are both in. Should I leave my RSS feed in the sidebars of all the content? Should I leave an anchor-text link in the sidebar (on all news pages etc.)? If so, there can be just one keyword: 407k pages linking with just one keyword? Should I keep it to just one link on the home page? I would love to hear what you guys think. (My domain is from 2001. Like a quality wine. However, it still tanked like a submarine.)
All the SEO reports I got here are now grade A; the site is finally fully optimized, and it's truly nice to have that confirmation. Now I hope someone can tell me what is best to do in order to get the most SEO gain out of this for my site. Thank you.
Intermediate & Advanced SEO | richardo24hr
Adding Orphaned Pages to the Google Index
Hey folks, how do you think Google will treat adding 300k orphaned pages to a 4.5-million-page site? The URLs would resolve, but there would be no on-site navigation to those pages; Google would only know about them through XML sitemaps. These pages are super low competition. The plot thickens: what we are really after is to get 150k real pages back on the site. Those pages do have crawlable paths on the site, but in order to do that (for technical reasons) we need to push the other 300k orphaned pages live as well (it's an all-or-nothing deal). a) Do you think Google will have a problem with this, or will it just decide not to index some or most of these pages since they are orphaned? b) If these pages will just fall out of the index or not get included, and have no chance of ever accumulating PageRank anyway since they are not linked to, would it make sense to just noindex them? c) Should we not submit sitemap.xml files at all, take our 150k, and just ignore these 300k and hope Google ignores them as well since they are orphaned? d) If Google is OK with this, maybe we should submit the XML sitemaps and keep an eye on the pages; maybe they will rank and bring us a bit of traffic, but we don't want to do that if it could be an issue with Google. Thanks for your opinions, and if you have any hard evidence either way, especially thanks for that info. 😉
Intermediate & Advanced SEO | irvingw
Is it possible to get a list of pages indexed in Google?
Is there a tool that will give me a list of pages on my site that are indexed in Google?
Intermediate & Advanced SEO | rise1
Disallowed Pages Still Showing Up in Google Index. What do we do?
We recently disallowed a wide variety of pages on www.udemy.com which we do not want Google indexing (e.g., /tags or /lectures). Basically, we don't want to spread our link juice around all these pages that are never going to rank; we want to keep it focused on our core pages, which are for our courses. We've added them as Disallow rules in robots.txt, but after 2-3 weeks Google is still showing them in its index. When we look up "site:udemy.com", for example, Google currently shows ~650,000 pages indexed, when it should really only be showing ~5,000. As another example, if you search for "site:udemy.com/tag", Google shows 129,000 results. We've definitely added "/tag" to our robots.txt properly, so this should not be happening; Google should be showing 0 results. Any ideas on how we get Google to pay attention and re-index our site properly?
Intermediate & Advanced SEO | udemy
How to Preserve PageRank for Disappearing Pages?
Pretend that USA Today has a section of their site where they sell electronics, located at http://electronics.usatoday.com. The subdomain is powered by an online electronics store called NewCo via a white label. Many of the pages on this subdomain have relatively high PageRank, but few, if any, external sites link to the subdomain; its PageRank is largely due to internal links from the usatoday.com root domain. USA Today's deal with NewCo expires and they decide to partner with my startup instead. But, unlike NewCo, we won't be providing a white-label solution; rather, USA Today will be redirecting all of the electronics-related links on their root domain to my site instead of to the electronics.usatoday.com subdomain. They also agree to direct all of the pages on electronics.usatoday.com to me. Ideally, USA Today would add 301s on all of the electronics.usatoday.com pages pointing to the corresponding pages on my site, but they don't have the engineering wherewithal or resources to do this. Therefore, what is the best way to pass the PageRank from the electronics.usatoday.com pages to my site? Would it work to have USA Today change the CNAME for electronics.usatoday.com to point at my site, and then create pages on my site that mimic the USA Today URL structure? For example, say there was a page located at electronics.usatoday.com/ipods. Could we give electronics.usatoday.com a CNAME for my site, create a page on my site located at mysite.com/ipods, and 301 the subdomain's /ipods URL to it? Would that preserve the PageRank?
Intermediate & Advanced SEO | jack78907890
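The CNAME-plus-redirect scheme described above could be sketched as server configuration on the startup's side. A minimal nginx example, using the hypothetical domains from the question: once the CNAME points electronics.usatoday.com at this server, every request for an old subdomain URL gets a 301 to the same path on the new site.

```
# Hypothetical nginx config on mysite.com's server. Requests arriving for the
# old subdomain are 301-redirected path-for-path to the new domain.
server {
    server_name electronics.usatoday.com;
    return 301 https://mysite.com$request_uri;
}
```

This is only a sketch of the mechanics; whether search engines treat redirects set up this way as equivalent to redirects implemented by USA Today themselves is exactly what the question is asking.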