Google indexing less url's then containded in my sitemap.xml
-
My sitemap.xml contains 3821 urls but Google (webmaster tools) indexes only 1544 urls. What may be the cause? There is no technical problem. Why does Google index less URLs then contained in my sitemap.xml?
-
Thank you for helping
-
Unless you have a SEO actively reviewing your site, it is quite normal for Google to index less pages then are offered in your sitemap.
How exactly was your sitemap created? Did you go by hand through your site's 3281 pages and add them to a sitemap? Or more likely, did you use a tool to create the sitemap? If you used a tool, how much knowledge do you have regarding how this tool works or its settings?
Just a few examples of URLs which may be included in your sitemap that Google would likely not index:
-
Your home page and other pages may have multiple URLs which lead to the same page. For example: www.mysite.com and www.mysite.com/index.html may be two URLs for the same page. Google will likely only index one of them.
-
You may have links to various URLs which contain parameters which Google will reduce to a single URL. For example: www.mysite.com/product_id=308&sort=asc&color=black, and another URL www.mysite.com/product_id=308&sort=desc&color=black. Both URLs lead to the same content sorted differently.
-
You may have duplicate content on your site. For example, you can sell chairs and list the same chair under multiple paths such as /furniture/wood/chair123 and /furniture/dining-room/chair123. Google will recognize these two pages are the same content presented under multiple URLs.
-
You may have submitted pages to your sitemap which are blocked via robots.txt or the "noindex" tag or are canonicalized to another page.
In order to better understand the root issue you need to examine a list of all URLs in your sitemap and compare that to a list of all indexed URLs. Determine which URLs Google has not indexed and research the reason for each one independently.
-
-
Are they index worthy?
Having them on your sitemap does not mean google wants them in its index
-
He just said it. Is this a new domain? Im in the same boat as you for some of my domains.
-
Yes, I understand this. But In this situation Google first indexes all the URL's within my sitemap.xml uploaded in Google Webmaster tools. Now Google indexes less URL's, only 50%. What can be the cause if there are no technical problems?
-
Hi!
Google will only spend 'so much time' on any new domain. The more traffic and links and page authority you get, the more time Google will dedicate to crawling your website. You should also make sure that the site is not slow, as this will reduce the crawling speed even more! See Google page speed for tips on speeding up the load time of your site
Good Luck,
Sven Witteveen
Expand Online
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why google removed my landing pages from index?
I made new website meko.lv. I put many work to it, to make page SEO friendly, sprites, reduced requests added SSL, got google page speed insights score 100/100, but in 2. october all pages in google webmasters disappeared from index. Could you please look at website and say whats wrong with it? They are all search results present in google but for how long. it is so annoying, you put so many work but in result get high spam score. It is obvious that new pages can not get good links in one month https://meko.lv/ google webmasters google page speed score: https://developers.google.com/speed/pagespeed/insights/?url=http%3A%2F%2Fmeko.lv%2F&tab=mobile q1LDHTn
Technical SEO | | Mekounko0 -
Some URLs in the sitemap not indexed
Our company site has hundreds of thousands of pages. Yet no matter how big or small the total page count, I have found that the "URLs Indexed" in GWMT has never matched "URLS in Sitemap". When we were small and now that we have a LOT more pages, there is always a discrepancy of ~10% or so missing from the index. It's difficult to know which pages are not indexed, but I have found some that I can verify are in the Sitemap.xml file but not at all in the index. When I go to GWMT I can "Fetch and Render" missing pages fine - it's not as though it's blocked or inaccessible. Any ideas on why this is? Is this type of discrepancy typical?
Technical SEO | | Mase0 -
Google Webmaster tools Sitemap submitted vs indexed vs Index Status
I'm having an odd error I'm trying to diagnose. Our Index Status is growing and is now up to 1,115. However when I look at Sitemaps we have 763 submitted but only 134 indexed. The submitted and indexed were virtually the same around 750 until 15 days ago when the indexed dipped dramatically. Additionally when I look under HTML improvements I only find 3 duplicate pages, and I ran screaming frog on the site and got similar results, low duplicates. Our actual content should be around 950 pages counting all the category pages. What's going on here?
Technical SEO | | K-WINTER0 -
Pages to be indexed in Google
Hi, We have 70K posts in our site but Google has scanned 500K pages and these extra pages are category pages or User profile pages. Each category has a page and each user has a page. When we have 90K users so Google has indexed 90K pages of users alone. My question is. Should we leave it as they are or should we block them from being indexed? As we get unwanted landings to the pages and huge bounce rate. If we need to remove what needs to be done? Robots block or Noindex/Nofollow Regards
Technical SEO | | mtthompsons0 -
Why are my URL's with a trailing slash still getting indexed even though they are redirected in the .htaccess file?
My .htaccess file is set up to redirect a URL with a trailing / to the URL without the /. However, my SEOmoz crawl diagnostics report is showing both URL's. I took a look at my Google Webmaster account and saw some duplicate META title issues. Same thing, Google Webmaster is showing the URL with the trailing /. My website was live for about 3 days before I added the code to the .htaccess file to remove the trailing /. Is it possible that in those 3 days that both versions were indexed and haven't been removed even though the .htaccess file has been updated?
Technical SEO | | mkhGT0 -
Alternatives to SEOmoz's Crawl Diagnistics
I really like SEOmoz's Crawl diagnostics reports, it goes through the pages and finds all sorts of valuable information, I wanted to know if there are any other services that compete against this specific service, to test the accuracy of their crawl diagnistics. Thanks
Technical SEO | | BestOdds0 -
Access To Client's Google Webmaster Tools
Hi, What's the best/easiest way for a client to grant access to his Google Webmaster Tools to me? Thanks! Best...Michael
Technical SEO | | 945010 -
Duplicate content and URL's
Hi Guys, Hope you are all well. Just a quick question which you will find nice and easy 🙂 I am just about to work through duplicate content pages and URL changes. Firstly, With the duplicate content issue i am finding the seo friendly URL i would normally direct to in some cases has less links, authority and root domain to it than some of the unseo friendly URL's. will this harm me if i still 301 redirect them to the seo friendly URL. Also, With the url changed it is going to be a huge job to change all the url so they are friendly and the CMS system is poor. Is there a better way of doing this? It has been suggested that we create a new webpage with a friendly URL and redirect all the pages to that. Will this lose all the weight as it will be a brand new page? Thank you for your help guys your legends!! Cheers Wayne
Technical SEO | | wazza19850