Significant Google crawl errors
-
We've got a site that, like clockwork, runs into server errors when Google crawls it. Since the end of last year it will go a week fine, then have two straight weeks of a 70%-100% error rate when Google tries to crawl it. During this time you can still enter the URL and reach the site in a browser, but spider simulators return a 404 error. Just this morning we had another error message; I did a fetch and resubmit, and magically now it's back. We moved the site to Go Daddy in January because the previous host (Tronics) kept getting hacked. It's built in HTML, so I'm wondering if it's something in the code maybe?
-
This is the URL error list in Webmaster Tools
| # | URL | Response code | Detected |
|---|---|---|---|
| 1 | Forms/Camp.pdf | 404 | 7/9/13 |
| 2 | sportsinsurance.php | 404 | 5/2/13 |
| 3 | Forms/Waiver.pdf | 404 | 7/2/13 |
| 4 | metro/index.htm | 404 | 6/21/13 |
| 5 | Forms/Camp_Tournament_Application.pdf | 404 | 7/9/13 |
| 6 | Forms/Spectator.pdf | 404 | 7/9/13 |
| 7 | Forms/Boxing.pdf | 404 | 5/6/13 |
| 8 | sports-camp-insurance.html | 404 | 6/16/13 |
| 9 | forms/T.C.S._ | 404 | 7/3/13 |
| 10 | Camp | 404 | 6/14/13 |
| 11 | Forms/Sports.pdf | 404 | 4/21/13 |
| 12 | pages/clients.html | 404 | 4/15/13 |
http://www.campteam.com/: Googlebot can't access your site (July 10, 2013)
Over the last 24 hours, Googlebot encountered 13 errors while attempting to connect to your site. Your site's overall connection failure rate is 72.2%.
I've got 23 of these messages going back to 11/12.
Webmaster Tools reports no robots.txt fetch issues and no DNS issues. According to Google, all of the errors are related to server connectivity.
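One way to narrow down a "works in the browser, fails for spiders" symptom like the one above is to request the same URL with two different User-Agent headers and compare the status codes. The following is only a minimal sketch using Python's standard library: the helper names (`fetch_status`, `compare_user_agents`, `differs_by_user_agent`) are made up for illustration, and it assumes the server reacts to the User-Agent header. If the host is filtering on Googlebot's IP ranges or throttling connections, this check will not reproduce the problem.

```python
import urllib.error
import urllib.request

# User-Agent strings to compare. The browser string is an arbitrary example;
# the second one is the classic Googlebot desktop User-Agent.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for url, or None if the connection failed."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions but still carry a code.
        return error.code
    except OSError:
        # Covers URLError, timeouts, refused connections and DNS failures.
        return None


def compare_user_agents(url, fetch=fetch_status):
    """Fetch url once per User-Agent and return {label: status}."""
    return {label: fetch(url, agent) for label, agent in USER_AGENTS.items()}


def differs_by_user_agent(results):
    """True if the server answered differently depending on the User-Agent."""
    return len(set(results.values())) > 1
```

Calling `compare_user_agents("http://www.campteam.com/")` during one of the bad weeks and passing the result to `differs_by_user_agent` would show whether the 404s depend on the User-Agent at all. A `None` in the result means the connection itself failed, which would match the connectivity errors Google is reporting.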
-
I see that your site is dealing fine with 404 errors. Hrmm. Could you copy and paste the crawl error URLs you are getting from Webmaster Tools? Thanks!
BTW I noticed that you have a duplicate content issue: the site is reachable both with and without the www, so the same pages exist under two hostnames. To 301-redirect everything to the non-www version, add the following code to your .htaccess file (replace my-domain.com with your own domain).
<code class="htaccess" title="in your .htaccess file">RewriteEngine On
RewriteCond %{HTTP_HOST} !^my-domain\.com$ [NC]
RewriteRule ^(.*)$ http://my-domain.com/$1 [R=301,L]</code>
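To make the effect of that rule easier to reason about before deploying it, here is a rough Python model of its logic. It is a sketch, not Apache's implementation: `canonical_redirect` and `CANONICAL_HOST` are hypothetical names, and the model ignores ports in the Host header and query strings (which mod_rewrite carries over by default).

```python
import re

# Placeholder domain taken from the .htaccess snippet above.
CANONICAL_HOST = "my-domain.com"


def canonical_redirect(host, path):
    """Return the 301 target for a request, or None if no redirect applies.

    Mirrors the rule above: any Host other than my-domain.com
    (compared case-insensitively, like the [NC] flag) is redirected
    to the same path on http://my-domain.com/.
    """
    if re.fullmatch(re.escape(CANONICAL_HOST), host, flags=re.IGNORECASE):
        return None  # already on the canonical host, the RewriteCond fails
    # In .htaccess context the pattern sees the path without its leading slash.
    return f"http://{CANONICAL_HOST}/{path.lstrip('/')}"
```

For example, `canonical_redirect("www.my-domain.com", "/Forms/Camp.pdf")` gives `http://my-domain.com/Forms/Camp.pdf`, while a request that already uses `my-domain.com` returns `None` and is served as-is. Note that the condition redirects every other hostname, not just the www one.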