Site: Query Question
-
Hi All,
I have a question about the site: query you can run on Google. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was also using it to get a high-level view of how many product pages were indexed versus the total number of pages.
What is interesting is that when I do a site: query for, say, www.newark.com, I get ~748,000 results.
When I do a query for www.newark.com "/dp/", I get ~845,000 results.
Either I am doing something stupid, or these numbers are completely backwards.
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine. (I've used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that scale, but the underlying principle is the same.)
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
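To illustrate the point about parameter-filled URLs, here is a minimal Python sketch of how you might separate them from clean URLs before comparing against a sitemap. The function names and the example.com URLs are made up for illustration; this is not something Google or any sitemap tool provides.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_params(url):
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def split_by_params(urls):
    """Separate clean URLs from those carrying query parameters."""
    clean, with_params = [], []
    for url in urls:
        if urlsplit(url).query:
            with_params.append(url)
        else:
            clean.append(url)
    return clean, with_params


# Hypothetical crawl export, for illustration only.
crawled = [
    "http://www.example.com/dp/123",
    "http://www.example.com/dp/123?sort=price&page=2",
    "http://www.example.com/dp/456?ref=nav",
]
clean, with_params = split_by_params(crawled)
print(len(clean), "clean,", len(with_params), "with parameters")
```

The parameter-stripped form is only a guess at the canonical URL; on a real site some parameters may change the page content and would need handling case by case.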
-
Best bet is to make sure all your URLs are in your sitemap, and then you get an exact count.
I've found it handy to use multiple sitemaps, one for each subfolder (e.g. /news/ or /profiles/), to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful for finding errors in a specific section or when you are working on the indexing of a certain type of page.
S
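A rough Python sketch of the per-subfolder idea, assuming you already have a flat list of URLs and that you read the submitted and indexed counts for each sitemap from your webmaster tools report by hand. The function names, the "root" bucket, and the example URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_section(urls):
    """Group URLs by their first path segment, e.g. /news/ or /profiles/.

    URLs with no subfolder (home page, top-level files) go into "root".
    """
    sections = defaultdict(list)
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        key = segments[0] if len(segments) > 1 else "root"
        sections[key].append(url)
    return dict(sections)


def percent_indexed(submitted, indexed):
    """Percentage of submitted URLs reported as indexed, to one decimal."""
    if submitted == 0:
        return 0.0
    return round(100.0 * indexed / submitted, 1)


# Hypothetical URL list, for illustration only.
urls = [
    "http://example.com/news/story-a",
    "http://example.com/news/story-b",
    "http://example.com/profiles/jane",
    "http://example.com/",
]
for section, members in group_by_section(urls).items():
    print(section, len(members))
```

Each group would then be written out as its own sitemap file, and percent_indexed applied to the counts reported for that file, so a section with a low percentage stands out quickly.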
-
What I've found is that the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have mass duplicate content issues. When I do a generic site: with the domain, Google shows 50,000-60,000 pages. If I do an inurl: with a specific URL param, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: Google will try its best to provide the content within the site that it shows the world based on "most relevant" content. When you do a refined check, it's naturally going to look for the content that really is most relevant - closest match to that actual parameter.
So if you're seeing more results with the refined process, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. So all those extra pages that come up in your refined check - many of them are most likely then evaluated as less than highly valuable / high quality or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center perhaps - what about if you add in the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.