More Indexed Pages than URLs on site.
-
According to webmaster tools, the number of pages indexed by Google on my site doubled yesterday (gone from 150K to 450K). Usually I would be jumping for joy but now I have more indexed pages than actual pages on my site.
I have checked for duplicate URLs pointing to the same product page but can't see any, pagination in category pages doesn't seem to be indexed nor does parameterisation in URLs from advanced filtration.
Using the site: operator we get a different result on google.com (450K) to google.co.uk (150K).
Anyone got any ideas?
-
Hi David,
Its tough to say without some more digging and information, it certainly looks like you have most of the common problem areas covered from what I can see. I will throw out an idea: I see you have a few 301 redirects in place switching from .html to non html versions. If this was done on a massive scale then possibly you have a google index with both versions of the pages in the index? If so it might not really be a big issue and over the next weeks/months the old .html versions will fall out of the index and your numbers will begin to look more normal again, Just a thought.
-
Thanks Lynn. The 31,000 was a bit of a legacy of issue and something we have solved. The robots file was changed a couple of weeks ago. So fingers crossed Google will deindex them soon. We get the same result when using inurl: where.
Any idea where the rest have come from?
-
Hi Irving
We checked everything obvious and cannot explain what is going on. I cannot see any major duplicate content issues and we do not have any subdomains active. The Moz crawler also doesn't highlight any major duplicate content issues.
-
Hi David,
Not sure why they started showing up now (some recent changes to the site?) but I suspect your problem is indexed urls that you are trying to block with robots.txt but are finding their way into the index somehow.
If you do a search for: site:nicontrols.com inurl:/manufacturer/ and then click on the show omitted results you will see a whole bunch (31000!) of 'content blocked by robots.txt' notices but the urls are still in the index. If you do a couple more similar searches looking for other likely url paths you will likely find some more.
If you can get a no-index meta tag into these pages I think it will be more effective in keeping them out of the index. If you have in mind some recent changes you have done to the site that might have introduced internal links to these pages then it would be worth looking to see if you can get the links removed or replaced with the 'proper' link format.
Hope that helps!
-
Can you see in the search the pages which are indexed and look for duplicates or technical issues causing improper indexing? Do you have other sites like subdomains Google might be counting as pages.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Trying to get Google to stop indexing an old site!
Howdy, I have a small dilemma. We built a new site for a client, but the old site is still ranking/indexed and we can't seem to get rid of it. We setup a 301 from the old site to the new one, as we have done many times before, but even though the old site is no longer live and the hosting package has been cancelled, the old site is still indexed. (The new site is at a completely different host.) We never had access to the old site, so we weren't able to request URL removal through GSC. Any guidance on how to get rid of the old site would be very appreciated. BTW, it's been about 60 days since we took these steps. Thanks, Kirk
Intermediate & Advanced SEO | | kbates0 -
Redirecting Pages During Site Migration
Hi everyone, We are changing a website's domain name. The site architecture will stay the same, but we are renaming some pages. How do we treat redirects? I read this on Search Engine Land: The ideal way to set up your redirects is with a regex expression in the .htaccess file of your old site. The regex expression should simply swap out your domain name, or swap out HTTP for HTTPS if you are doing an SSL migration. For any pages where this isn’t possible, you will need to set up an individual redirect. Make sure this doesn’t create any conflicts with your regex and that it doesn’t produce any redirect chains. Does the above mean we are able to set up a domain redirect on the regex for pages that we are not renaming and then have individual 1:1 redirects for renamed pages in the same .htaccess file? So have both? This will not conflict with the regex rule?
Intermediate & Advanced SEO | | nhhernandez0 -
Pages Titles in SERPs - Wordpress Site
In Google SERPs we have several websites (built in wordpress) who's pages are being displayed without using the page title - is this google ignoring the page title or is there a problem in our code - also if this is google is it still taking notice of the page title to determine what content is on the page?I have read several articles on this but wondered if someone can advise - I can provide the URL if required.Also I wanted to 100% that our robots.txt is behaving its self.
Intermediate & Advanced SEO | | JohnW-UK0 -
Can we retrieve all 404 pages of my site?
Hi, Can we retrieve all 404 pages of my site? is there any syntax i can use in Google search to list just pages that give 404? Tool/Site that can scan all pages in Google Index and give me this report. Thanks
Intermediate & Advanced SEO | | mtthompsons0 -
No index.no follow certain pages
Hi, I want to stop Google et al from finding a some pages within my website. the url is www.mywebsite.com/call_backrequest.php?rid=14 As these pages are creating a lot of duplicate content issues. Would the easiest solution be to place a 'Nofollow/Noindex' META tag in page www.mywebsite.com/call_backrequest.php many thanks in advance
Intermediate & Advanced SEO | | wood1e19680 -
How do you transition a keyword rank from a home page to a sub-page on the site?
We're currently ranking #1 for a valuable keyword, but the result on the SERP is our home page. We're creating a new product page focused on this keyword to provide a better user experience and create more relevant content. What is the best way to make a smooth transition to make the product page rank #1 for the keyword instead of the home page?
Intermediate & Advanced SEO | | buildasign0 -
Our site is recieving traffic for both .com/page and .com/page/ with the trailing slash.
Our site is recieving traffic for both .com/page and .com/page/ with the trailing slash. Should we rewrite to just the trailing slash or without because of duplicates. The other question is, if we do a rewrite, google has indexed some pages with the slash and some without - i am assuming we will lose rank for one of them once we do the rewrite, correct?
Intermediate & Advanced SEO | | Profero0 -
We are changing ?page= dynamic url's to /page/ static urls. Will this hurt the progress we have made with the pages using dynamic addresses?
Question about changing url from dynamic to static to improve SEO but concern about hurting progress made so far.
Intermediate & Advanced SEO | | h3counsel0