Help Blocking Crawlers. Huge Spike in "Direct Visits" with 96% Bounce Rate & Low Pages/Visit.
-
Hello,
I'm hoping one of you search geniuses can help me.
We have a successful client who started seeing a HUGE spike in direct visits as reported by Google Analytics. This traffic now represents approximately 70% of all website traffic. These "direct visits" have a bounce rate of 96%+ and only 1-2 pages/visit. This is skewing our analytics in a big way and rendering them pretty much useless. I suspect this is some sort of crawler activity but we have no access to the server log files to verify this or identify the culprit. The client's site is on a GoDaddy Managed WordPress hosting account.
The way I see it, there are a couple of possibilities.
1.) Our client's competitors are scraping the site on a regular basis to stay on top of site modifications, keyword emphasis, etc. It seems like whenever we make meaningful changes to the site, one of their competitors does a knock-off a few days later. Hmmm.
2.) Our client's competitors have this crawler hitting the site thousands of times a day to raise bounce rates and decrease the average time on site, which could likely have a negative impact on SEO. Correct me if I'm wrong, but I don't believe Google is going to reward sites with 90% bounce rates, 1-2 pages/visit and an 18-second average time on site.
The bottom line is that we need to identify these bogus "direct visits" and find a way to block them. I've seen several WordPress plugins that claim to help with this but I certainly don't want to block valid crawlers, especially Google, from accessing the site.
If someone out there could please weigh in on this and help us resolve the issue, I'd really appreciate it. Heck, I'll even name my third-born after you.
Thanks for your help.
Eric
-
Hi SirMax,
Thanks for your input. I appreciate it. We'll add Wordfence to our WordPress toolbox and see if that addresses the issue.
In response to previous posts, thanks to everyone for your input. We were able to apply some filters to remove the bogus bot traffic from the analytics and normalize the data. However, this did not actually resolve the issue, and in my eyes it is more of a Band-Aid fix. The evil crawlers are still there; we just can't see them.
Thanks again for all of your input.
Eric
-
Hostname filtering does not work anymore. Unfortunately, most of the spammers have adapted and now send your own website's domain as the hostname.
For WordPress I use the Wordfence plugin (the paid version; I'm not affiliated with them in any shape or form beyond paying for their services). In the advanced blocking settings you can set limits on how fast and how many pages crawlers can request. You can also block by country or IP range. It can also show you live traffic with a lot of detail (a lot more than Google Analytics, more like a server log). It might not be the complete remedy, but it can help.
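To make the idea of limiting "how fast and how many pages" a crawler can request concrete, here is a minimal sketch of a per-IP sliding-window limit in Python. This is not how Wordfence is implemented; the class name, the parameters and the thresholds are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of allowed requests

    def allow(self, client_ip: str, now: float) -> bool:
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: a plugin would block or throttle here
        hits.append(now)
        return True
```

A limit like this only sees requests that actually reach the web server, so it would do nothing about spam that is sent straight to Google Analytics without loading a page.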
-
I wish I had an answer for how to stop the bots from hitting your site at all - I don't think a good one exists, as any solutions that wouldn't also block real human traffic to your site are going to be easy for spam bots to get around. I think your best bet is just to do everything you can to keep your data as clean as possible.
-
Hi Ruth,
Thanks a bunch for taking the time to respond to my post. Great advice. This is reassuring on a number of levels; however, it doesn't address the underlying issue of how to stop these spam bots in the first place.
We've already started the process of filtering out some of this bogus data. We'll also be integrating some WordPress plugins to see if that helps. That said, if the spam bots are hitting Analytics directly, as opposed to the actual website, WP plugins won't do anything.
Anyway, I appreciate your input and advice. Thanks so much.
Eric
-
Hi Eric,
A few things to reassure you off the bat:
- For what it's worth, there is a huge, HUGE amount of crawler spam happening on the web today. Every site I work on is being hit hard with false referrals and direct visits. I know Google Analytics is working on a solution to better filter these visits out. So I wouldn't be too concerned that it is something a competitor is doing to your site, specifically - it's more likely that it's been caught up in the general wave of spam crawlers.
- It's important to note that when we talk about Google looking at bounce rate and dwell time as part of ranking your site, those numbers are specifically from clicks through from search - that's data that Google can get without using your private web analytics data as a ranking factor, which they've said repeatedly that they don't and won't do. So a bunch of direct visits with high bounce rates will NOT affect your rankings.
So, it's not dangerous, just annoying. On to how to get that data out of your reports:
- Make sure you're not filtering out spam referrers at a View level - this can cause those visits to incorrectly appear as direct traffic.
- You could set up an Advanced Segment in Google Analytics to filter out direct visits with visit times of, say, under 5 seconds. Some real traffic may get caught in that, but it will get the noise levels down.
- The best way to filter out spam bot traffic, in my opinion, is to set up hostname filtering. Here's a post on Megalytic on how to do that: https://megalytic.com/blog/how-to-filter-out-fake-referrals-and-other-google-analytics-spam. Make sure you've also got an "Unfiltered Data" View so you'll still have historic raw data if you need it.
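If you export session data and want to see how much traffic those two rules would remove before you apply them in Google Analytics, the logic can be sketched roughly like this in Python. The `Session` fields, the hostnames in `VALID_HOSTNAMES` and the 5-second threshold are assumptions for illustration, not Google Analytics field names or settings.

```python
from dataclasses import dataclass

# Hypothetical values: replace with the hostnames that legitimately serve your pages.
VALID_HOSTNAMES = {"example.com", "www.example.com"}
MIN_DIRECT_SECONDS = 5  # the "say, under 5 seconds" cutoff; tune for your own traffic


@dataclass
class Session:
    hostname: str
    source: str      # e.g. "(direct)", "google", "bing"
    duration: float  # session duration in seconds


def is_probably_noise(session: Session) -> bool:
    """Flag sessions that the hostname rule or the short-direct-visit rule would exclude."""
    if session.hostname.lower() not in VALID_HOSTNAMES:
        return True
    return session.source == "(direct)" and session.duration < MIN_DIRECT_SECONDS


def clean(sessions):
    """Return only the sessions that neither rule flags."""
    return [s for s in sessions if not is_probably_noise(s)]
```

As with the segment itself, some real visitors will be caught by the duration rule, so it is worth comparing the cleaned and raw counts rather than trusting either alone.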
Hope that helps! Good luck.
-
Check the web server log files, or log visits yourself (IP address, user agent, __utma, __utmz, possibly a browser fingerprint, etc.).
Analyzing those, you can easily find out whether the traffic is from a scraping bot or from humans.
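As a rough sketch of that kind of analysis, assuming you can get access logs in the common "combined" format (the host in this thread may not expose them), the following Python counts requests per IP address and user agent so that unusually heavy clients stand out. The function names and the threshold of 100 requests are assumptions for illustration.

```python
import re
from collections import Counter

# Combined Log Format: ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count requests per (ip, user agent) pair, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            counts[(match["ip"], match["agent"])] += 1
    return counts


def heavy_hitters(counts, threshold=100):
    """Return (ip, user agent) pairs with at least `threshold` requests, busiest first."""
    return [(key, n) for key, n in counts.most_common() if n >= threshold]
```

A single client making thousands of requests with no cookies and an odd user agent points to a bot; if the logs show no matching requests at all, the "visits" are probably being sent to Analytics directly.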