Site: Query Question
-
Hi All,
I have a question about the site: query you can run on Google. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was also using it to try to get a high-level view of how many product pages were indexed vs. the total number of pages.
What is interesting is when I do a site: query for say www.newark.com I get ~748,000 results returned.
When I do a query for www.newark.com "/dp/" I get ~845,000 results returned.
Either I am doing something stupid or these numbers are completely backwards?
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine. (I've used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that scale, but the underlying principle is the same.)
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
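To make the "parameter-filled URL" problem concrete, here is a minimal Python sketch that separates clean URLs from ones carrying a query string and shows the clean form each would collapse to. The example.com URLs and the function names are made up for illustration, and dropping the whole query string is an assumption that only holds if none of the parameters change the page content.

```python
from urllib.parse import urlsplit, urlunsplit


def has_parameters(url):
    """Return True if the URL carries a query string."""
    return bool(urlsplit(url).query)


def strip_parameters(url):
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical URLs, e.g. taken from a crawl or a log file.
urls = [
    "https://www.example.com/dp/123",
    "https://www.example.com/dp/123?sort=price&page=2",
    "https://www.example.com/dp/456?ref=home",
]

# Only the parameterised URLs would be missing from a generated sitemap.
parameterised = [u for u in urls if has_parameters(u)]
clean_targets = sorted({strip_parameters(u) for u in parameterised})
```

Comparing `clean_targets` against the sitemap is one way to check that every parameterised URL at least has a clean counterpart you actually want indexed.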
-
Best bet is to make sure all your URLs are in your sitemap and then you get an exact count.
I've found it handy to use multiple sitemaps, one for each subfolder (e.g. /news/ or /profiles/), to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful for finding errors in a specific section, or when you are working on the indexing of a certain type of page.
S
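A rough sketch of the per-section idea, in Python. The example.com URLs, the function names, and the submitted/indexed numbers are all made up for illustration; the real counts would come from whatever your sitemap report shows for each file.

```python
from collections import defaultdict
from urllib.parse import urlparse


def section_of(url):
    """Return the first path segment (e.g. 'news'), or 'root'."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # A URL needs at least a folder and a page to belong to a section;
    # anything shallower goes into a catch-all "root" bucket.
    return parts[0] if len(parts) >= 2 else "root"


def group_by_section(urls):
    """Bucket URLs by section so each bucket can become its own sitemap."""
    sections = defaultdict(list)
    for url in urls:
        sections[section_of(url)].append(url)
    return dict(sections)


def percent_indexed(submitted, indexed):
    """Share of submitted URLs reported as indexed, as a percentage."""
    return round(100 * indexed / submitted, 1) if submitted else 0.0


urls = [
    "https://www.example.com/news/launch",
    "https://www.example.com/news/update",
    "https://www.example.com/profiles/ben",
    "https://www.example.com/about",
]
groups = group_by_section(urls)

# Hypothetical (submitted, indexed) counts per sitemap file.
report = {"news": (200, 150), "profiles": (1000, 420)}
coverage = {name: percent_indexed(s, i) for name, (s, i) in report.items()}
```

A section whose percentage sits well below the others is the one to look at first.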
-
What I've found is that the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have mass duplicate content issues. When I do a generic site: with the domain, Google shows 50,000-60,000 pages. If I do an inurl: with a specific URL param, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: Google will try its best to provide the content within the site that it shows the world based on "most relevant" content. When you do a refined check, it's naturally going to look for the content that really is most relevant - closest match to that actual parameter.
So if you're seeing more results with the refined process, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. So all those extra pages that come up in your refined check - many of them are most likely then evaluated as less than highly valuable / high quality or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center perhaps - what about if you add in the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.