Indexed Pages in Google, How do I find Out?
-
Is there a way to get a list of pages that google has indexed?
Is there some software that can do this?
I do not have access to webmaster tools, so hoping there is another way to do this.
Would be great if I could also see if the indexed page is a 404 or other
Thanks for your help, sorry if its basic question
-
If you want to find all your indexed pages in Google just type: site:yourdomain.com or .co.uk or other without the www.
-
Hi John,
Hope I'm not too late to the party! When checking URL's for their cache status I suggest using Scrapebox (with proxies).
Be warned, it was created as a black-hat tool, and as such is frowned upon, but there are a number of excellent white-hat uses for it! Costs $57 one off
-
sorry to keep sending you messages but I wanted to make sure that you know SEOmoz does have a fantastic tool for what you are requesting. Please look at this link and then click on the bottom where it should says show more and I believe you will agree it does everything you've asked and more.
http://pro.seomoz.org/tools/crawl-test
Sincerely,
Thomas
does this answer your question?
-
What giving you a 100 limit?
try using Raven tools or spider mate they both have excellent free trials and allow you quite a bit of information.
-
Neil you are correct I agree with screaming frog is excellent they definitely will show you your site. Here is a link from SEOmoz associate that I believe will benefit you
http://www.seomoz.org/q/404-error-but-i-can-t-find-any-broken-links-on-the-referrer-pages
sincerely,
Thomas
-
this is what I am looking for Thanks
Strange that there is no tool I can buy to do this in full without the 100 limit
Anyway, i will give that a go
-
can I get your sites URL? By the way this might be a better way into Google Webmaster tools
if you have a Gmail account use that if you don't just sign up using your regular e-mail.
Of course using SEOmoz via http://pro.seomoz.org/tools/crawl-test will give you a full rundown of all of your links and how they're running. Are you not seen all of them?
Another tool I have found very useful. Is website analysis as well as their midsize product from Alexia
I hope I have helped,
Tom
-
If you don't have access to Webmaster Tools, the most basic way to see which pages Google has indexed is obviously to do a site: search on Google itself - like "site:google.com" - to return pages of SERPs containing the pages from your site which Google has indexed.
Problem is, how do you get the data from those SERPs in a useful format to run through Screaming Frog or similar?
Enter Chris Le's Google Scraper for Google Docs
It will let scrape the first 100 results, then let you offset your search by 100 and get the next 100, etc.. slightly cumbersome, but it will achieve what you want to do.
Then you can crawl the URLs using Screaming Frog or another crawler.
-
just thought I might add these links these might help explain it better than I did.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1352276
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=2409443&topic=2446029&ctx=topic
http://pro.seomoz.org/tools/crawl-test
you should definitely sign up for Google Webmaster tools it is free here is a link all you need to do is add an e-mail address and password
http://support.google.com/webmasters/bin/topic.py?hl=en&topic=1724121
I hope I have been of help to you sincerely,
Thomas
-
Thanks for the reply.
I do not have access to webmaster tools and the seomoz tools do not show a great deal of the pages on my site for some reason
Majestic shows up to 100 pages. Ahrefs shows some also.
I need to compare what google has indexed and the status of the page
Does screaming frog do thiss?
-
Google Webmaster tools should supply you with this information. In addition Seomoz tools will tell you that and more. Run your website through the campaign section of seomoz you will then see any issues with your website.
You may also want to of course use Google Webmaster tools run a test as a Google bot the Google but should show you any issues you are having such is 404's or other fun things that websites do.
If you're running WordPress there are plenty of plug-ins I recommend 404 returned
sincerely,
Thomas
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Crawling/indexing of near duplicate product pages
Hi, Hope someone can help me out here. This is the current situation: We sell stones/gravel/sand/pebbles etc. for gardens. I will take a type of pebbles and the corresponding pages/URL's to illustrate my question --> black beach pebbles. We have a 'top' product page for black beach pebbles on which you can find different types of quantities (differing from 20kg untill 1600 kg). There is not any search volume related to the different quantities The 'top' page does not link to the pages for the different quantities The content on the pages for the different quantities is not exactly the same (different price + slightly different content). But a lot of the content is the same. Current situation:
Intermediate & Advanced SEO | | AMAGARD
- Most pages for the different quantities do not have internal links (about 95%) But the sitemap does contain all of these pages. Because the sitemap contains all these URL's, google frequently crawls them (I checked the logfiles) and has indexed them. Problems: Google spends its time crawling irrelevant pages --> our entire website is not that big, so these quantity URL's kind of double the total number of URL's. Having url's in the sitemap that do not have an internal link is a problem on its own All these pages are indexed so all sorts of gravel/pebbles have near duplicates. My solution: remove these URL's from the sitemap --> that will probably stop Google from regularly crawling these pages Putting a canonical on the quantity pages pointing to the top-product page. --> that will hopefully remove the irrelevant (no search volume) near duplicates from the index My questions: To be able to see the canonical, google will need to crawl these pages. Will google still do that after removing them from the sitemap? Do you agree that these pages are near duplicates and that it is best to remove them from the index? A few of these quantity pages do have intenral links (a few procent of them) because of a sale campaign. So there will be some (not much) internal links pointing to non-canonical pages. Would that be a problem? Thanks a lot in advance for your help! Best!1 -
Delete page and tell it to Google
Hello everybody, i have a problem with some pages of my website. I have had to removed 5-10 pages because these pages linked to 404 pages and i removed it. Need i to tell to Google or Only removed? Thanks so much
Intermediate & Advanced SEO | | pompero990 -
Is it a problem that Google's index shows paginated page urls, even with canonical tags in place?
Since Google shows more pages indexed than makes sense, I used Google's API and some other means to get everything Google has in its index for a site I'm working on. The results bring up a couple of oddities. It shows a lot of urls to the same page, but with different tracking code.The url with tracking code always follows a question mark and could look like: http://www.MozExampleURL.com?tracking-example http://www.MozExampleURL.com?another-tracking-examle http://www.MozExampleURL.com?tracking-example-3 etc So, the only thing that distinguishes one url from the next is a tracking url. On these pages, canonical tags are in place as: <link rel="canonical<a class="attribute-value">l</a>" href="http://www.MozExampleURL.com" /> So, why does the index have urls that are only different in terms of tracking urls? I would think it would ignore everything, starting with the question mark. The index also shows paginated pages. I would think it should show the one canonical url and leave it at that. Is this a problem about which something should be done? Best... Darcy
Intermediate & Advanced SEO | | 945010 -
How long takes to a page show up in Google results after removing noindex from a page?
Hi folks, A client of mine created a new page and used meta robots noindex to not show the page while they are not ready to launch it. The problem is that somehow Google "crawled" the page and now, after removing the meta robots noindex, the page does not show up in the results. We've tried to crawl it using Fetch as Googlebot, and then submit it using the button that appears. We've included the page in sitemap.xml and also used the old Google submit new page URL https://www.google.com/webmasters/tools/submit-url Does anyone know how long will it take for Google to show the page AFTER removing meta robots noindex from the page? Any reliable references of the statement? I did not find any Google video/post about this. I know that in some days it will appear but I'd like to have a good reference for the future. Thanks.
Intermediate & Advanced SEO | | fabioricotta-840380 -
Language Subdirectory homepage not indexed by Google
Hi mozzers, Our Spanish homepage doesn't seem to be indexed or cached in Google, despite being online for over a month or two. All Spanish subpages are indexed and have started to rank but not the homepage. I have submitted sitemap xml to GWTools and have checked there's no noindex on the page - it seems to be in order. And when I run site: command in Google it shows all pages except homepage. What could be the problem? Here's the page: http://www.bosphorusyacht.com/es/
Intermediate & Advanced SEO | | emerald0 -
Why isn't google indexing our site?
Hi, We have majorly redesigned our site. Is is not a big site it is a SaaS site so has the typical structure, Landing, Features, Pricing, Sign Up, Contact Us etc... The main part of the site is after login so out of google's reach. Since the new release a month ago, google has indexed some pages, mainly the blog, which is brand new, it has reindexed a few of the original pages I am guessing this as if I click cached on a site: search it shows the new site. All new pages (of which there are 2) are totally missed. One is HTTP and one HTTPS, does HTTPS make a difference. I have submitted the site via webmaster tools and it says "URL and linked pages submitted to index" but a site: search doesn't bring all the pages? What is going on here please? What are we missing? We just want google to recognise the old site has gone and ALL the new site is here ready and waiting for it. Thanks Andrew
Intermediate & Advanced SEO | | Studio330 -
Volusion store product pages will not index
Hello, I have moved over to Volusion and was wondering if you guys know of any SEO practices that are Volusion specific. i have been working on this site now for 2 months and my impressions and rankings have dropped substantially My 301 redirects where in place before I flipped over and my keywords / titles/ tags etc.. are in place. However i am still not making any progress in the engines. I have noticed that my products are not being indexed per Webmaster tools. I have heard that volusion has something set up to where you must purchase their SEO package in order to rank. I am really at my wits end and currently I thinking about taking a loss and reverting back to my old Shoppe Pro site. Any help would be very appreciated
Intermediate & Advanced SEO | | kerry0217
.0 -
How long till pages drop out of the index
In your experience how long does it normally take for 301-redirected pages to drop out of Google's index?
Intermediate & Advanced SEO | | bjalc20110