3,511 Pages Indexed and 3,331 Pages Blocked by Robots

PeaSoupDigital

Morning,

So I checked our site's index status in WMT, and I'm being told that Google is indexing 3,511 pages and that 3,331 are blocked by robots. This seems slightly odd as we're only disallowing 24 URLs in the robots.txt file. In light of this, I have the following queries:

Do these figures mean that Google is indexing 3,511 pages and blocking 3,331 other pages? Or does it mean that it's blocking 3,331 pages of the 3,511 indexed?
As there are only 24 URLs being disallowed in robots.txt, why are 3,331 pages being blocked? Will these be variations of the URLs we've disallowed?
Currently, we don't have a sitemap. I know, I know, it's pretty unforgivable but the old one didn't really work and the developers are working on the new one. Once submitted, will this help?
I think I know the answer to this, but is there any way to ascertain which pages are being blocked?

Thanks in advance!

Lewis

PeaSoupDigital

Hi,

No more links than a standard e-commerce site should have...

I'm chasing the sitemap as we speak.

Cheers,

MonicaOConnor

The blocked URLs are probably nofollow links throughout the site. Do you have a lot of links pointing outward from your pages?

Google is indexing 3511 pages, of which 3331 are blocked by Robots. I would check some of the internal/external links on those disallowed pages. I don't see how it could come up to 3331 blocked pages, but it couldn't hurt to start there.
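One way 24 disallow rules could add up to thousands of blocked URLs is that each rule is matched as a path prefix, so a single line can cover every URL beneath it, including query-string variations. Here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and URLs below are made up for illustration and are not taken from the site in question. Note too that this parser does plain prefix matching and does not implement Google's wildcard extensions, so treat the result as an approximation of what Googlebot would do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with only two rules (assumption, not the real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /basket/
Disallow: /search
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs from `urls` that the given robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical crawl list, e.g. exported from a site crawler.
CANDIDATE_URLS = [
    "https://example.com/",
    "https://example.com/products/green-widget",
    "https://example.com/basket/",
    "https://example.com/basket/add?item=1",
    "https://example.com/search?q=widgets",
    "https://example.com/search?q=widgets&page=2",
]

if __name__ == "__main__":
    for url in blocked_urls(ROBOTS_TXT, CANDIDATE_URLS):
        print("blocked:", url)
```

Running the same function over a full crawl export of the site would also give a rough list of which pages are being blocked, which WMT itself doesn't break down.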

Definitely get a sitemap submitted asap. It will help for sure.

Whittie

Excuse the short reply.

Add the sitemap's URL to your robots.txt, and submit it in Google WMT.

Could you just use a free one in the meantime, while the new one is still being developed?
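As a stopgap, a basic sitemap is simple enough to generate by hand. The sketch below builds a minimal sitemap in the sitemaps.org 0.9 format and the matching `Sitemap:` line for robots.txt. The function names, the example.com URLs, and the `/sitemap.xml` location are all assumptions for illustration; swap in the real list of pages and wherever the file is actually hosted.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap XML document listing each URL in a <loc>."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = page_url  # special chars are escaped
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def robots_sitemap_line(sitemap_url):
    """Return the line that points crawlers at the sitemap from robots.txt."""
    return f"Sitemap: {sitemap_url}"


if __name__ == "__main__":
    # Hypothetical pages; replace with the site's real canonical URLs.
    pages = [
        "https://example.com/",
        "https://example.com/products/green-widget",
    ]
    print(build_sitemap(pages))
    print(robots_sitemap_line("https://example.com/sitemap.xml"))
```

The `Sitemap:` line takes a full absolute URL and can go anywhere in robots.txt, since it is independent of the `User-agent` groups.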
