Page not being indexed or crawled and no idea why!
-
Hi everyone,
There are a few pages on our website that aren't being indexed right now on Google and I'm not quite sure why. A little background:
We are an IT training and management training company and we have locations/classrooms around the US. To better our search rankings and overall visibility, we made some changes to the on-page content, URL structure, etc. Let's take our Washington DC location for example. The old address was:
http://www2.learningtree.com/htfu/location.aspx?id=uswd44
And the new one is:
http://www2.learningtree.com/htfu/uswd44/reston/it-and-management-training
All of the SEO changes aren't live yet, so just bear with me. My question really regards why the first URL is still being indexed and crawled and showing fine in the search results and the second one (which we want to show) is not. Changes have been live for around a month now - plenty of time to at least be indexed.
In fact, we don't want the first URL to show anymore; we'd like the second URL type to show across the board. Also, when I type site:http://www2.learningtree.com/htfu/uswd44/reston/it-and-management-training into Google, I get a message that Google can't read the page because of the robots.txt file. But we have no robots.txt file. I've been told by our web guys that the two pages are exactly the same. I was also told that we've put in an order to have all those old links 301 redirected to the new ones. Still, I'm perplexed as to why these pages are not being indexed or crawled, even after I manually submitted them in Webmaster Tools.
So, why is Google still recognizing the old URLs and why are they still showing in the index/search results?
And why is Google saying "A description for this result is not available because of this site's robots.txt"?
Thanks in advance!
- Pedram
-
Hi Mike,
Thanks for the reply. I'm out of the country right now, so reply might be somewhat slow.
Yes, we have links to the pages on our sitemaps and I have done fetch requests. I did a check now and it seems that the niched "New York" page is being crawled now. Might have been a time issue as you suggested. But, our DC page still isn't being crawled. I'll check up on it periodically and see the progress. I really appreciate your suggestions - it's already helping. Thank you!
-
It possibly just hasn't been long enough for the spiders to re-crawl everything yet. Have you done a fetch request in Webmaster Tools for the page and/or site to see if you can jumpstart things a little? It's also possible that the spiders haven't found a path to it yet. Do you have enough (or any) pages linking to that second page that isn't being indexed yet?
-
Hi Mike,
As a follow-up, I forwarded your suggestions to our webmasters. They adjusted the robots.txt, which now reads as follows. I think it still might cause issues, and I'm not 100% sure why it is set up this way:
User-agent: *
Allow: /htfu/
Disallow: /htfu/app_data/
Disallow: /htfu/bin/
Disallow: /htfu/PrecompiledApp.config
Disallow: /htfu/web.config
Disallow: /
Now, this page is being indexed:
http://www2.learningtree.com/htfu/uswd74/alexandria/it-and-management-training
But a more niched page still isn't being indexed:
http://www2.learningtree.com/htfu/usny27/new-york/sharepoint-training
Suggestions?
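For anyone who wants to sanity-check rules like these offline, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `is_crawlable` and the "Googlebot" agent string are just illustrative choices, and the rules are pasted in as a string rather than fetched from the live site. One caveat: Python's parser applies the first matching rule in file order, while Google documents a most-specific-match rule, so the two can disagree on paths such as /htfu/bin/. For the two page URLs in question, both approaches should agree.

```python
from urllib.robotparser import RobotFileParser

# The adjusted rules as quoted in the post, one directive per line.
ADJUSTED_RULES = """\
User-agent: *
Allow: /htfu/
Disallow: /htfu/app_data/
Disallow: /htfu/bin/
Disallow: /htfu/PrecompiledApp.config
Disallow: /htfu/web.config
Disallow: /
"""


def is_crawlable(rules: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `rules` let `agent` fetch `url` (per Python's parser)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www2.learningtree.com/htfu/uswd74/alexandria/it-and-management-training",
        "http://www2.learningtree.com/htfu/usny27/new-york/sharepoint-training",
        "http://www2.learningtree.com/",
    ):
        print(url, is_crawlable(ADJUSTED_RULES, url))
```

Under these rules both /htfu/ pages come back as crawlable, which hints that the New York page's delay is probably not a robots.txt problem. The final Disallow: / does block everything outside /htfu/, including the site root, so that line is worth a second look if other sections of the site should be indexed.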
-
The pages in question don't have any Meta Robots Tags on them. So once the Disallow in Robots.txt is gone and you do a fetch request in Webmaster Tools, the page should get crawled and indexed fine. If you don't have a Meta Robots Tag, the spiders consider it Index,Follow. Personally I prefer to include the index, follow tag anyway even if it isn't 100% necessary.
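If you want to confirm the meta robots situation yourself, a rough sketch along these lines reads the directives out of a page's HTML with Python's built-in `html.parser`. The class and function names are made up for illustration, and the HTML is passed in as a string (for example, pasted from View Source). When no tag is found, it falls back to index, follow, matching the default described above.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.append(part)


def effective_robots(html: str) -> set:
    """Return the page's robots directives, or the index/follow default."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return set(parser.directives) or {"index", "follow"}
```

Calling `effective_robots(page_source)` on a page with no meta robots tag returns the index, follow default, while a page carrying `content="noindex, nofollow"` returns those two directives instead.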
-
Thanks, Mike. That was incredibly helpful. See, I did click the link on the SERP when I did the "site" search on Google, but I was thinking it was a mistake. Are you able to see the robots disallow in the source code?
-
Your robots.txt (which can be found at http://www2.learningtree.com/robots.txt) does in fact have Disallow: /htfu/, which would be blocking http://www2.learningtree.com/htfu/uswd44/reston/it-and-management-training from being crawled. While your old page is also technically blocked, it has been around longer and would already have been cached, so it will still appear in the SERPs. The bots just won't be able to see changes made to it because they can't crawl it.
You need to fix the disallow so the bots can crawl your site correctly and you should 301 your old page to the new one.
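Once the redirects are in place, a quick way to verify them is to request the old URL without following redirects and look at the status code and Location header. The sketch below is one way to do that with Python's standard library. The function names are illustrative, and it assumes the server answers HEAD requests the same way it answers GET, which is not guaranteed.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urljoin, urlsplit


def fetch_status_and_location(url: str) -> Tuple[int, Optional[str]]:
    """Send one HEAD request (no redirect following); return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_301_to(old_url: str, status: int, location: Optional[str], new_url: str) -> bool:
    """True only for a permanent (301) redirect whose target resolves to new_url."""
    if status != 301 or not location:
        return False
    # Location may be relative, so resolve it against the old URL first.
    return urljoin(old_url, location) == new_url
```

For example, `is_301_to(old, *fetch_status_and_location(old), new)` with the old location.aspx URL and the new Reston URL should return True once the redirect is live. A 302 or a 200 returns False, which is worth catching because a temporary redirect does not tell search engines to replace the old URL.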