Internal linking question
-
Hi there. Are all internal links listed in GWMT actually indexed?
-
Jonnygeekuk,
If GWT is telling you they are "aware" (whether indexed or not) of URLs that you do not want indexed, and you have either blocked them in the robot.txt file or the robots header tag, or the page serves a 404 or 410 response in the http header, it wouldn't hurt to use the URL removal tool to remove those pages from the index just to be sure.
-
So, sounds like you're looking for a list of indexed pages? Will this tool help?
http://www.intavant.com/tools/google-indexed-pages-extractor/
-
I'm sorry it's taking me so long to get back to you on this. However you told me you say you're using the removal tool in Google Webmaster tools?
I want to be certain you're not using the link disavow tool as a removal tool is that correct?
"Google updates its entire index regularly. When we crawl the web, we automatically find new pages, remove outdated links, and reflect updates to existing pages, keeping the Google index fresh and as up-to-date as possible.
If outdated pages from your site appear in the search results, ensure that the pages return a status of either 404 (not found) or 410 (gone) in the header. These status codes tell Googlebot that the requested URL isn't valid. Some servers are misconfigured to return a status of 200 (Successful) for pages that don't exist, which tells Googlebot that the requested URLs are valid and should be indexed. If a page returns a true 404 error via the http headers, anyone can remove it from the Google index using the webpage removal request tool. Outdated pages that don't return true 404 errors usually fall out of our index naturally when other pages stop linking to them."
"
Reincluding content in search
"Content removed using the URL removal tool will not appear in search results for a minimum of 90 days or until the content has been removed from the Google index. However, if you've updated robots.txt, added meta tags, or password-protected content to prevent it being crawled, the content should naturally have dropped out of our index, and you shouldn't need to worry about it reappearing after 90 days. You can reinclude your content at any time during the 90-day period by following the steps below.
Reinclude content:
- On the Webmaster Tools Home page, click the site you want.
- In the left-hand menu, click Optimization, and then click Remove URLs.
- Select the Removed content tab, and then click Reinclude next to the content you want to reinclude in the Google index.
Pending requests are usually processed within 3-5 business days."
-
Hi Chris, Thomas
Thanks for taking the time to reply.
Essentially, the reason i'm asking this question is recently the site in question became heavily over indexed due to search filters etc becoming indexed. This resulted in a ton of thin content being indexed. We've since no indexed these pages but they are taking time to drop off so we are helping a little by using the removal tool in GWMT. A lot of these pages are hidden, it's difficult to find them in the main index but index status says we still have >7k pages indexed when we really should have fewer than 2k. A site: command reveals about 9k but only 600 are listed and they are all valid pages. Basically we're trying to find the urls to remove and noticed that a lot of them are listed in the internal links tab on GWMT. I just wondered whether it was advisable to remove these too, in addition to the 2.5k we have already removed.
-
Hi Johnny, I want to tell you that I agree with what Chris stated above. If you're looking for someone to confirm that. You want to also make sure you do not have over 100 to 150 URLs or internal links on your site. This will hurt Google indexing of the website.
I also use a tool to make internal links. And if that is what you are speaking of. It's called http://scribecontent.com. You can use it not only on word press but on all sites. I have found it to be extremely useful please be cautious though it how many links you built internally so that you do not create a page that cannot be indexed correctly.
http://www.distilled.net/u/search-engine-basics/#crawling
I hope I've been in help,
Thomas
-
Hey JonnyG,
Be sure not to confuse links with URLs. Essentially, a link is clickable thing on a web page that, when clicked, takes the user to another URL. A URL is an address (non-clickable) . A web page is the resource that exists at a URL.
Anyway, the Internal Links tab shows how many links exist on your site that can take you to other pages on your site. However, if you click on the Health | Index Status tab, you'll get choices to see Basic and Advanced info on your indexed URLs. In the advanced tab, you'll see the total number of pages Google's index on your site. Google's Webmaster Tools Help has a page on Index Status for more info.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Cant find internal links on one of my pages.
When I run open site explorer for www.kingremodeling.com/ss.php?pid=5 it says there are 40 links to it on my site. However, I cannot find these links on any of the pages that open site explorer lists such as my homepage www.KingRemodeling.com. Totally confused!
Technical SEO | | allb830 -
Disavow questions
Pretty sure I know the answers to these but someone asked me to make absolutely sure so here goes, any opinions welcome: If i disavow a whole domain does it include all sub-domains on the domain also?- my answer is clearly yes. If i have network of links really bad linking to my website that are already nofollow but awful websites to be linked on, is it worth putting them in the disavow list anyway to basically tell Google literally no association? I know the whole point of disavow is to essentially nofollow the link. Opinions much appreciated, thank you guys.
Technical SEO | | tdigital0 -
Why my external links are zero
What could be the possibility that my Moz crawler showing zero external link for my website http://ultimatecharter.com, i have build many links from different website and when i click them it goes to the website. My website is multi language and the landing page is http://ultimatecharter.com/en/home can this be a possible issue? regards Aqeel
Technical SEO | | Aqeel0 -
Should I no follow all external links?
I have worked with a few different SEO firms lately and a lot of them have recommended on the sites I was working on to "no-follow" all external links on the site. On one hand this traps all the link equity/Pagerank. On the other I would think this practice is frowned upon by Google. What are some opinions on this?
Technical SEO | | MarloSchneider0 -
Back Link Question
Hi Folks, Our domain (www.alabu.com) has been around since 2000. We've accumulated a lot of back links over the years, many of which I don't recognize and didn't ask for. I've been reading on here recently about "cleaning up" back links. I do see a lot of ours that just aren't relevant and I don't know why they decided to link to us. We haven't gotten a warning from google or anything like that, but I wonder, how do I know if we could benefit from cleaning up our back links? Is there a benefit to it even if google hasn't warned us? Thanks! Hal
Technical SEO | | AlabuSkinCare0 -
Too Many On-Page Links on a Blog
I have a question about the number of on-page links on a page and the implications on how we're viewed by search engines. After SEOmoz crawls our website, we consistently get notifications that some of our pages have "Too Many On-Page Links." These are always limited to pages on our blog, and largely a function of our tag cloud (~ 30 links) plus categories (10 links) plus popular posts (5 links). These all display on every blog post in the sidebar. How significant a problem is this? And, if you think it is a significant problem, what would you suggest to remedy the problem? Here's a link to our blog in case it helps: http://wiredimpact.com/blog/ The above page currently is listed as having 138 links. Any advice is much appreciated. Thanks so much. David
Technical SEO | | WiredImpact0 -
Mod rewrite question
Sorry in advance if this isn't the best place to ask this question. Google Webmaster Tools has recently identified a ton of "Not Found" pages, which are actual pages with some digits appended at the end. For example, suppose an actual page on my blog is: (A) http://www.example.com/blog/2012/09/my-post-title/ This page works just fine. However, GWT has identified the following page as a "not found" page: (B) http://www.example.com/blog/2012/09/my-post-title/9157586677/1846732913010 This appears to be happening to hundreds of posts on my site. In each case, the "9157586677" portion of the URL is identical, but the remaining 13 digits change from page to page. I haven't been able to determine exactly what is causing this to happen - it's probably a social plug-in for Wordpress, or perhaps Disqus, but I'm not sure which one. I'll go through a process of elimination to narrow it down over the coming week. As a quick fix, I'd like to create a ModRewrite rule so that requests for (B) get 301 redirected to (A). Since there are hundreds of posts, I need to do this in a way that works regardless of what's in the "/2012/09/my-post-title/" part of the URL. Unfortunately, mod-rewrite is outside of my area of expertise. Can somebody please suggest how I can handle this? Thanks in advance. PS - As for tracking down the cause, I've looked at the source of the pages in the "Linked From" area of GWT and the Not Found link is nowhere to be found. That is why I assume the bad link is being generated by some javascript that is a part of one of my plug-ins. Update: It seems like Disqus is the source of these phantom links. There's considerable discussion here. I'll continue searching for a long-term solution. Meanwhile, I'd still appreciate help with the mod-rewrite question above. Thanks again.
Technical SEO | | ahirai0 -
Different links to to the same page
Hi, Based on the user's actions we post activity into users Facebook timeline. And each activity has link back to our particular page on our website. For example if original page was: www.Domain.com from Facebook timeline it would be like this: www.Domain.com?Ffb_action_ids=101508953168 Do you think this will have a negative effect on our page rankings as we will eded up having a lot of different URL's to the same page? www.Domain.com?Ffb_action_ids=101508953168 www.Domain.com?Ffb_action_ids=456788765609 etc.. Thank you, Karen Bdoyan
Technical SEO | | showme0