Help Blocking Crawlers. Huge Spike in "Direct Visits" with 96% Bounce Rate & Low Pages/Visit.
-
Hello,
I'm hoping one of you search geniuses can help me.
We have a successful client who started seeing a HUGE spike in direct visits as reported by Google Analytics. This traffic now represents approximately 70% of all website traffic. These "direct visits" have a bounce rate of 96%+ and only 1-2 pages/visit. This is skewing our analytics in a big way and rendering them pretty much useless. I suspect this is some sort of crawler activity but we have no access to the server log files to verify this or identify the culprit. The client's site is on a GoDaddy Managed WordPress hosting account.
The way I see it, there are a couple of possibilities.
1.) Our client's competitors are scraping the site on a regular basis to stay on top of site modifications, keyword emphasis, etc. It seems like whenever we make meaningful changes to the site, one of their competitors does a knock-off a few days later. Hmmm.
2.) Our client's competitors have this crawler hitting the site thousands of times a day to raise bounce rates and decrease the average time on site, which could likely have a negative impact on SEO. Correct me if I'm wrong, but I don't believe Google is going to reward sites with 90% bounce rates, 1-2 pages/visit and an 18-second average time on site.
The bottom line is that we need to identify these bogus "direct visits" and find a way to block them. I've seen several WordPress plugins that claim to help with this but I certainly don't want to block valid crawlers, especially Google, from accessing the site.
If someone out there could please weigh in on this and help us resolve the issue, I'd really appreciate it. Heck, I'll even name my third-born after you.
Thanks for your help.
Eric
-
Hi SirMax,
Thanks for your input. I appreciate it. We'll add Wordfence to our WordPress toolbox and see if that addresses the issue.
In response to previous posts, thanks to everyone for your input. We were able to apply some filters to remove the bogus bot traffic from the analytics and normalize the data. However, this did not actually resolve the issue, and in my eyes it is more of a Band-Aid fix. The evil crawlers are still there; we just can't see them.
Thanks again for all of your input.
Eric
-
Hostname filtering does not work anymore. Unfortunately, most of the spammers have adapted and now send your own website as the hostname.
For WordPress I use the Wordfence plugin (the paid version; I'm not affiliated with them in any shape or form beyond paying for their services). In the advanced blocking settings you can set limits on how fast and how many pages crawlers can request. You can also block by country or IP range. It can also show you live traffic with a lot of detail (a lot more than Google Analytics, more like a server log). It might not be a complete remedy, but it can help.
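To make the "how fast and how many pages" idea concrete, here is a minimal sketch of a per-IP sliding-window limit in Python. This is not Wordfence's implementation, just the general technique; the class name, the thresholds and the use of the client IP as the key are all assumptions for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds.

    Hypothetical helper for illustration only; real plugins are more involved.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # client IP -> timestamps (in seconds) of its recently allowed requests
        self._hits = defaultdict(deque)

    def allow(self, client_ip, now):
        hits = self._hits[client_ip]
        # Forget requests that have fallen out of the time window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: block or throttle this request
        hits.append(now)
        return True
```

With `SlidingWindowLimiter(3, 60)`, a fourth request from the same IP inside one minute would be refused, while other IPs are unaffected. A real setup would also need to exempt verified search engine crawlers so Google is not throttled.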
-
I wish I had an answer for how to stop the bots from hitting your site at all - I don't think a good one exists, as any solutions that wouldn't also block real human traffic to your site are going to be easy for spam bots to get around. I think your best bet is just to do everything you can to keep your data as clean as possible.
-
Hi Ruth,
Thanks a bunch for taking the time to respond to my post. Great advice. This is reassuring on a number of levels. However, it doesn't address the underlying issue of how to stop these spam bots in the first place.
We've already started the process of filtering out some of this bogus data. We'll also be integrating some WordPress plugins to see if that helps. That said, if the spam bots are hitting Analytics directly, as opposed to the actual website, WP plugins won't do anything.
Anyway, I appreciate your input and advice. Thanks so much.
Eric
-
Hi Eric,
A few things to reassure you off the bat:
- For what it's worth, there is a huge, HUGE amount of crawler spam happening on the web today. Every site I work on is being hit hard with false referrals and direct visits. I know Google Analytics is working on a solution to better filter these visits out. So I wouldn't be too concerned that it is something a competitor is doing to your site specifically - it's more likely that it's been caught up in the general wave of spam crawlers.
- It's important to note that when we talk about Google looking at bounce rate and dwell time as part of ranking your site, those numbers are specifically from clicks through from search - that's data that Google can get without using your private web analytics data as a ranking factor, which they've said repeatedly that they don't and won't do. So a bunch of direct visits with high bounce rates will NOT affect your rankings.
So, it's not dangerous, just annoying. On to how to get that data out of your reports:
- Make sure you're not filtering out spam referrers at a View level - this can cause those visits to incorrectly appear as direct traffic.
- You could set up an Advanced Segment in Google Analytics to filter out direct visits with visit times of, say, under 5 seconds. Some real traffic may get caught in that, but it will get the noise levels down.
- The best way to filter out spam bot traffic, in my opinion, is to set up hostname filtering. Here's a post on Megalytic on how to do that: https://megalytic.com/blog/how-to-filter-out-fake-referrals-and-other-google-analytics-spam. Make sure you've also got an "Unfiltered Data" View so you'll still have historic raw data if you need it.
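The last two suggestions above can be sketched in code, for anyone who would rather test the rules on exported session data before building them in Google Analytics. This is only an illustration in Python: the `example.com` hostname pattern, the dictionary keys and the 5-second threshold are assumptions, not anything Google Analytics defines for you.

```python
import re

# Assumption: the only hostnames where the real tracking code is installed.
VALID_HOSTNAME = re.compile(r"^(www\.)?example\.com$")


def is_probable_spam(session, min_seconds=5):
    """Flag a session as likely bot noise.

    `session` is a dict with hypothetical keys:
      'hostname' - hostname reported with the hit
      'medium'   - traffic medium; direct traffic is reported as '(none)'
      'duration' - session duration in seconds
    """
    # Rule 1: the hit claims a hostname that is not one of ours.
    if not VALID_HOSTNAME.match(session["hostname"]):
        return True
    # Rule 2: a direct visit that lasted less than the threshold.
    is_direct = session["medium"] == "(none)"
    return is_direct and session["duration"] < min_seconds


def clean(sessions):
    """Keep only the sessions that pass both rules."""
    return [s for s in sessions if not is_probable_spam(s)]
```

As noted, the duration rule will also catch some real visitors, so treat the result as a cleaner approximation rather than the truth.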
Hope that helps! Good luck.
-
Check the web server log files, or log the visits yourself (IP address, user agent, the __utma and __utmz cookies, possibly a browser fingerprint, etc.).
By analyzing those, you can usually tell whether the traffic comes from a scraping bot or from humans.
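As a rough sketch of that kind of analysis, the Python below counts requests per IP address and user agent. It assumes the log is in the common "combined" access log format; the regular expression and the function name are my own and would need adjusting for whatever your host actually writes.

```python
import re
from collections import Counter

# Assumption: "combined" access log format, e.g.
# 203.0.113.7 - - [10/Oct/2015:13:55:36 +0000] "GET / HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def requests_per_client(lines):
    """Count requests per (IP address, user agent) pair, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match["ip"], match["agent"])] += 1
    return counts


def busiest_clients(lines, top=10):
    """Return the `top` most active (IP, user agent) pairs with their counts."""
    return requests_per_client(lines).most_common(top)
```

A handful of clients responsible for thousands of single-page requests, or with odd user agents, would be the first candidates to look at more closely.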