Page not being indexed or crawled and no idea why!
-
Hi everyone,
There are a few pages on our website that aren't being indexed right now on Google and I'm not quite sure why. A little background:
We are an IT training and management training company and we have locations/classrooms around the US. To better our search rankings and overall visibility, we made some changes to the on page content, URL structure, etc. Let's take our Washington DC location for example. The old address was:
http://www2.learningtree.com/htfu/location.aspx?id=uswd44
And the new one is:
http://www2.learningtree.com/htfu/uswd44/reston/it-and-management-training
All of the SEO changes aren't live yet, so just bear with me. My question really regards why the first URL is still being indexed and crawled and showing fine in the search results and the second one (which we want to show) is not. Changes have been live for around a month now - plenty of time to at least be indexed.
In fact, we don't want the first URL to be showing anymore, we'd like the second URL type to be showing across the board. Also, when I type into Google site:http://www2.learningtree.com/htfu/uswd44/reston/it-and-management-training I'm getting a message that Google can't read the page because of the robots.txt file. But, we have no robots.txt file. I've been told by our web guys that the two pages are exactly the same. I was also told that we've put in an order to have all those old links 301 redirected to the new ones. But still, I'm perplexed as to why these pages are not being indexed or crawled - even manually submitted it into Webmaster tools.
So, why is Google still recognizing the old URLs and why are they still showing in the index/search results?
And, why is Google saying "A description for this result is not available because of this site's robots.txt"
Thanks in advance!
- Pedram
-
Hi Mike,
Thanks for the reply. I'm out of the country right now, so reply might be somewhat slow.
Yes, we have links to the pages on our sitemaps and I have done fetch requests. I did a check now and it seems that the niched "New York" page is being crawled now. Might have been a time issue as you suggested. But, our DC page still isn't being crawled. I'll check up on it periodically and see the progress. I really appreciate your suggestions - it's already helping. Thank you!
-
It possibly just hasn't been long enough for the spiders to re-crawl everything yet. Have you done a fetch request in Webmaster Tools for the page and/or site to see if you can jumpstart things a little? Its also possible that the spiders haven't found a path to it yet. Do you have enough (or any) pages linking into that second page that isn't being indexed yet?
-
Hi Mike,
As a follow up, I forwarded your suggestions to our Webmasters. The adjusted the robots.txt and now reads this, which I think still might cause issues and am not 100% sure why this is:
User-agent: * Allow: /htfu/ Disallow: /htfu/app_data/ Disallow: /htfu/bin/ Disallow: /htfu/PrecompiledApp.config Disallow: /htfu/web.config Disallow: / Now, this page is being indexed: http://www2.learningtree.com/htfu/uswd74/alexandria/it-and-management-training But, a more niched page still isn't being indexed: http://www2.learningtree.com/htfu/usny27/new-york/sharepoint-training Suggestions?
-
The pages in question don't have any Meta Robots Tags on them. So once the Disallow in Robots.txt is gone and you do a fetch request in Webmaster Tools, the page should get crawled and indexed fine. If you don't have a Meta Robots Tag, the spiders consider it Index,Follow. Personally I prefer to include the index, follow tag anyway even if it isn't 100% necessary.
-
Thanks, Mike. That was incredibly helpful. See, I did click the link on the SERP when I did the "site" search on Google, but I was thinking it was a mistake. Are you able to see the disallow robot on the source code?
-
Your Robots.txt (which can be found at http://www2.learningtree.com/robots.txt) does in fact have Disallow: /htfu/ which would be blocking http://www2.learningtree.com**/htfu/**uswd44/reston/it-and-management-training from being crawled. While your old page is also technically blocked, it has been around longer and would already have been cached so will still appear in the SERPs.... the bots just won't be able to see changes made to it because they can't crawl it.
You need to fix the disallow so the bots can crawl your site correctly and you should 301 your old page to the new one.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Do home page carry more seo benefit than other pages?
hi, i would like to include my kws in the URL and they are under 50 characters. is there anything in the algo that tells engines to give more importance to homepage?
White Hat / Black Hat SEO | | alan-shultis0 -
My indexed site URL removed from google search without get any message or Manual Actions???
On Agust 2 or 3.. I'm not sure about the exact date...
White Hat / Black Hat SEO | | newwaves
The main URL of my website https://new-waves.net/ had been completely removed from Google search results! without getting any messages or Manual Actions on search console ?? but I'm still can find some of my site subpages in search results and on Google local maps results when I tried to check it on google
info:new-waves.net >> no results
site:new-waves.net >> only now I can see the main URL in results because I had submitted it again and again to google but it might be deleted again today or tomorrow as that happen before last few days
100% of all ranked keywords >> my site URL new-waves.net had been completely removed from all results! but I'm still can see it on maps on some results I never get any penalties to my site on Google search console. I noticed some drops on some keywords before that happens (in June and July) but it all of it was related to web design keywords for local Qatar, but all other keywords that related to SEO and digital marketing were not have any changes and been on top My site was ranked number 1 on google search results for "digital marketing qatar" and some other keywords, but the main URL had been removed from 100% of all search results. but you can still see it on the map only. I just tried to submit it again to Google and to index it through google search console tool but still not get any results, Recently, based on google console, I found some new links but I have no idea how it been added to links of my website:
essay-writing-hub.com - 9,710
tiverton-market.co.uk - 252
facianohaircare.com - 48
prothemes.biz - 44
worldone.pw - 2
slashdot.org - 1
onwebmarketing.com - 1 the problem is that all my high PR real links deleted from google console as well although it still have my site link and it could be recognized by MOZ and other sites! Can any one help to know what is the reason?? and how can I solve this issue without losing my previous ranked keywords? Can I submit a direct message to google support or customer service to know the reason or get help on this issue? Thanks & Regards0 -
Competitor Drops 10,000 links since last index. Lets play detective.
One of the intriguing things about SEO is being able to reverse engineer your competitors rankings because all the technical information is available for those who know where to look. I recently looked at my Dashboard and saw that one of my competitors had dropped 10,000 links. The questions is why? Google algorthm change? Blackhat Penalty? Something else.? Here are the numbers, I am going to lieave my own clients site out because his numbers are pathetic. www.Leafly(dot)com 50.4k Links Down 10k www.thcfinder(dot)com 1,530 links Down 71 www.weedmaps(dot)com 64,000k links Up 1.5K Is it just me or is that a lot of links to loose over one indexing period?
White Hat / Black Hat SEO | | DavidMeshah0 -
Redirecting from https to http - will pass whole link juice to new http website pages?
Hi making permanent 301 redirection from https to http - will pass whole link juice to new http website pages?
White Hat / Black Hat SEO | | Aman_1230 -
Unique page URLs and SEO titles
www.heartwavemedia.com / Wordpress / All in One SEO pack I understand Google values unique titles and content but I'm unclear as to the difference between changing the page url slug and the seo title. For example: I have an about page with the url "www.heartwavemedia.com/about" and the SEO title San Francisco Video Production | Heartwave Media | About I've noticed some of my competitors using url structures more like "www.competitor.com/san-francisco-video-production-about" Would it be wise to follow their lead? Will my landing page rank higher if each subsequent page uses similar keyword packed, long tail url? Or is that considered black hat? If advisable, would a url structure that includes "san-francisco-video-production-_____" be seen as being to similar even if it varies by one word at the end? Furthermore, will I be penalized for using similar SEO descriptions ie. "San Francisco Video Production | Heartwave Media | Portfolio" and San Francisco Video Production | Heartwave Media | Contact" or is the difference of one word "portfolio" and "contact" sufficient to read as unique? Finally...am I making any sense? Any and all thoughts appreciated...
White Hat / Black Hat SEO | | keeot0 -
Help with E-Commerce Product Pages
Hi, I need to find the best way to put my products on our e-commerce website. I have researched and researched but I thought I'd gather a range of ideas in here. Basically I have the following fields: Product Title
White Hat / Black Hat SEO | | YNWA
Product Description
Product Short Description SEO Title
Focus Keyword(s) (this is a feature of our CMS)
Meta Description The problem we have is we have a lot of duplicate content eg. 10 Armani Polos but then each one will be a different colour (but the model number is the same). I don't want to miss out on rankings because of this. What would you say is the best way to do this? My idea is this: Product Title: Armani Jeans Polo Shirt Blue
Product Description: Armani Jeans Polo Shirt in Blue Made from 100% cotton Armani Jeans Polo with Short Sleeves, Pique Collar and Button Up Collar. Designer Boutique Menswear are official stockists of Armani Jeans Polos.
Short Description: Blue Armani Jeans Polo SEO Title: Armani Jeans Polo Shirt Blue MA001 | Designer Boutique Menswear
Focus Keywords: Armani Jeans Polo Shirt
Meta Description: Blue Armani Jeans Polo Shirt. Made from 100% cotton. Designer Boutique Menswear are official stockists of Armani Polos. What are peoples thoughts on this? I would then run the same format across each of the different colours. Another question is on the product title and seo title, should these be exactly the same? And does it matter if I put the colour at the beginning or end of the title? Any help would be great.0 -
Linking my pages
Hello everybody, i have a small dilemma and i am not shore what to do. I am (my company) the owner of 10 e-commerce web sites. On every site i have a link too the other 9 sites and i am using an exact keyvoerd (not the shop name).Since the web stores are big and have over a 1000 pages, this means thet all my sites have a lot off inbound links (compared with my competiton). I am woried that linking them all together could be bad from Googles point of wiev. Can this couse a problem for me, should i shange it? Regardes, Marko
White Hat / Black Hat SEO | | Spletnafuzija0 -
Index page de-indexed / banned ?
Yesterday google removed our index page from the results. Today they also removed language subdomains (fr.domain.com).. Index page, subdomains are not indexed anymore. Any suggestions? -- No messages in GWT. No malware. Backlink diversification was started in May. Never penguilized or pandalized. Last week had the record of all times of daily UV. Other pages still indexed and driving traffic, left around 40% of total. Never used any black SEO tool. 95% of backlinks are related; sidebar, footer links No changes made of index page for couple months.
White Hat / Black Hat SEO | | bele0