Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking and so on.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could use some kind of randomness in user agent strings, URLs, and IPs in order to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs: those sites often get penalized by Google, but 3rd party crawlers have no way to know this, so blocking effectively keeps them out of the indexes.
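For anyone wondering what "obey robots.txt and properly identify yourself" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `examplebot`, its info URL, the sample robots.txt and the one-second default delay are all made up for illustration; this is not how OSE or any other crawler is actually implemented.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical crawler identity: a stable name plus a URL describing the bot.
USER_AGENT = "examplebot/1.0 (+https://example.com/bot)"

# Hypothetical robots.txt of a site that blocks our bot but allows others.
ROBOTS_TXT = """\
User-agent: examplebot
Disallow: /

User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def _parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True only if robots.txt allows this user agent to fetch url."""
    return _parser(robots_txt).can_fetch(user_agent, url)


def delay_for(robots_txt: str, user_agent: str, default: int = 1) -> int:
    """Seconds to wait between requests: the site's Crawl-delay, else a default."""
    delay: Optional[int] = _parser(robots_txt).crawl_delay(user_agent)
    return default if delay is None else delay


if __name__ == "__main__":
    page = "https://example.com/page"
    # An honest bot checks under its real name and skips the page if blocked.
    print(may_fetch(ROBOTS_TXT, USER_AGENT, page))
    print(delay_for(ROBOTS_TXT, "otherbot"))
```

The point of the sketch is that the check is keyed on the bot's own name: a site that lists `examplebot` under `Disallow: /` is skipped, which is exactly the blocking the original question complains about.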
-
Well, I think bot blocking is an obvious problem even now, and it will matter even more tomorrow with all the private networks, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing is maybe neither the best nor the desired solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
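To illustrate why a spoofed Googlebot user agent "could be detected": Google documents a reverse-then-forward DNS check that site owners can run on the visiting IP. Below is a minimal sketch of that check. The function name, the accepted hostname suffixes and the injectable `reverse`/`forward` lookups are my own assumptions for the example, not code from any of the tools mentioned.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Googlebot reverse-DNS names.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_dns(ip: str) -> str:
    """Resolve an IP address to its primary hostname (PTR lookup)."""
    return socket.gethostbyaddr(ip)[0]


def _forward_dns(host: str) -> List[str]:
    """Resolve a hostname to its list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_genuine_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse_dns,
    forward: Callable[[str], List[str]] = _forward_dns,
) -> bool:
    """Check a visitor that claims to be Googlebot in its user agent.

    1. Reverse-resolve the IP; the hostname must end with a Google suffix.
    2. Forward-resolve that hostname; it must map back to the same IP.
    A crawler that only changes its user agent string fails step 1.
    """
    try:
        host = reverse(ip)
    except OSError:  # lookup failed (socket.herror / socket.gaierror)
        return False
    if not host.endswith(GOOGLE_SUFFIXES):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

So a third-party crawler sending Googlebot's user agent from its own IP range would show exactly the mismatch described above, and any webmaster running a check like this would spot it.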