Google crawling 200 page site thousands of times/day. Why?
-
Hello all, I'm looking at something a bit wonky for one of the websites I manage. It's similar enough to other websites I manage (built on a template) that I'm surprised to see this issue occurring. The xml sitemap submitted shows Google there are 229 pages on the site. Starting in the beginning of December Google really ramped up their intensity in crawling the site. At its high point Google crawled 13,359 pages in a single day.
I mentioned I manage other similar sites - this is a very unusual spike. There are no resources like infinite scroll that auto generates content and would cause Google some grief.
So follow up questions to my "why?" is "how is this affecting my SEO efforts?" and "what do I do about it?". I've never encountered this before, but I think limiting my crawl budget would be treating the symptom instead of finding the cure. Any advice is appreciated. Thanks!
*edited for grammar.
-
I have a final update for everyone! We discovered the cause of the mysterious increase in crawling. One of our partners tested out a second version of the content on the website (yes, we have two complete sets of content for every page) by swapping out the first set with the second set. The second set caused Google to reevaluate the entire website, crawl it repeatedly thousands of times for two weeks, then stop.
The result of this refresh was a jump in the rankings. We were ranking on page one for about 15% of our targeted keywords and after the new content was inputted it jumped to 71%. Only time will tell if those new rankings will stick, but for now it looks pretty good.
-
Update: after about two weeks the crawl rate returned to normal. We haven't been able to identify a cause yet.
-
It is strange. It's definitely worth looking at access logs and analyzing crawler data there so you can see what pages are getting hit by the crawler just to be sure you understand the activity.
-
Well I would be more then happy if Google would visit my pages more often then once a day. We have around 100k original pages and we also see them visiting 250k pages daily with uplifts to 350k+ which I don't consider to be a bad thing. As long as you're sure about the fact that they see the right pages I would say it's a good thing. The crawl rate really varies day over day for any site, sometimes you get a high rate for a while and then it drops again when Google will find out that your site isn't creating that much new fresh content anymore.
Curious about your idea with the sitemap priority, to my experience + knowledge it doesn't change anything.
-
Yes I have, and yes there are pages that aren't listed in the sitemap and aren't supposed to be there. That's being corrected (we're considering experimenting with priority tags in the sitemap to see if it has an impact over just immediately blocking with robots.txt or meta robots). But if you factor in those pages, it still only amounts to 303 pages.
Weird, right?
-
Have you tried scanning the site with something like screaming frog to make sure there aren't pages that just aren't listed in the sitemap? Ie. tag or category pages, images or other partial content pieces that are creating pages.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
No Index thousands of thin content pages?
Hello all! I'm working on a site that features a service marketed to community leaders that allows the citizens of that community log 311 type issues such as potholes, broken streetlights, etc. The "marketing" front of the site is 10-12 pages of content to be optimized for the community leader searchers however, as you can imagine there are thousands and thousands of pages of one or two line complaints such as, "There is a pothole on Main St. and 3rd." These complaint pages are not about the service, and I'm thinking not helpful to my end goal of gaining awareness of the service through search for the community leaders. Community leaders are searching for "311 request service", not "potholes on main street". Should all of these "complaint" pages be NOINDEX'd? What if there are a number of quality links pointing to the complaint pages? Do I have to worry about losing Domain Authority if I do NOINDEX them? Thanks for any input. Ken
Intermediate & Advanced SEO | | KenSchaefer0 -
Google Indexing our site
We have 700 city pages on our site. We submitted to google via a https://www.samhillbands.com/sitemaps/locations.xml but they only indexed 15 so far. Yes the content is similar on all of the pages...thought on getting them to index the remaining pages?
Intermediate & Advanced SEO | | brianvest0 -
How do i prevent Google and Moz from counting pages as duplicates?
I have 130,000 profiles on my site. When not Connected to them they have very few differences. So a bot - not logged in, etc, will see a login form and "Connect to Profilename" MOZ and Google call the links the same, even though theyre unique such as example.com/id/328/name-of-this-group example.com/id/87323/name-of-a-different-group So how do i separate them? Can I use Schema or something to help identify that these are profile pages, or that the content on them should be ignored as its help text, etc? Take facebook - each facebook profile for a name renders simple results: https://www.facebook.com/public/John-Smith https://www.facebook.com/family/Smith/ Would that be duplicate data if facebook had a "Why to join" article on all of those pages?
Intermediate & Advanced SEO | | inmn0 -
Will Google read my page title and H1?
Dim strTitle : strTitle = "The Title Of My Page" <title>Company name - <%=strTitle%></title> <%=strTitle%> Will Google be able to read this? When I view source the relevant information is in the tags but I'm wondering if Google hates this or not? Cheers!
Intermediate & Advanced SEO | | Hughescov0 -
Cleaning up /index.html on home page
All, What is the best way to deal with a home page that has the /index.html at the end of it? 301 redirect to the .com home page? Just want to make sure I'm not missing something. Thanks in advance.
Intermediate & Advanced SEO | | JSOC0 -
Does Google crawl the pages which are generated via the site's search box queries?
For example, if I search for an 'x' item in a site's search box and if the site displays a list of results based on the query, would that page be crawled? I am asking this question because this would be a URL that is non existent on the site and hence am confused as to whether Google bots would be able to find it.
Intermediate & Advanced SEO | | pulseseo0 -
Site navigation menu in head of page for SEO
We are considering expanding our site navigation menu (horizontal) at the top of our pages. However, once implemented, this would include around 30-40 links at the top of the page; before the content of the page. How much effect (good/bad) would this have on SEO? Are their any other options? (perhaps rendering the menu after the main content with CSS)? Any thoughts, suggestions or ideas would be greatly appreciated.
Intermediate & Advanced SEO | | Peter2640