Why don't Moz OSE, Ahrefs, Majestic, and so on change their user agent while crawling?
-
Some blackhat websites, PBNs, and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking, and so on.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize their user agent strings, URLs, and IPs to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you:
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so their crawl priority would be very low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs. Those sites often get penalized by Google, and third-party crawlers have no way to know this, so the blocking effectively keeps penalized sites out of the third-party indexes.
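To make the three rules above concrete, here is a minimal sketch of what a polite crawler's pre-flight check might look like, using Python's standard urllib.robotparser. The bot name ExampleLinkBot, its info URL, and the helper names are made up for illustration; they are not the real user agent or code of any tool mentioned in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical identity: a polite bot names itself and points to an info page.
BOT_TOKEN = "ExampleLinkBot"
USER_AGENT = BOT_TOKEN + "/1.0 (+https://example.com/bot)"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, url: str) -> bool:
    """Obey robots.txt: True only if the rules allow our bot to fetch the URL."""
    return parser.can_fetch(BOT_TOKEN, url)


def crawl_delay_for(parser: RobotFileParser, default: float = 1.0) -> float:
    """Crawl politely: use the site's Crawl-delay if present, else a default."""
    delay = parser.crawl_delay(BOT_TOKEN)
    return float(delay) if delay is not None else default


# A site that blocks this bot by name, which the bot then respects.
blocked = build_parser("User-agent: ExampleLinkBot\nDisallow: /\n")
print(may_crawl(blocked, "https://example.org/page"))
```

A real crawler would send USER_AGENT with every request and sleep for crawl_delay_for(parser) seconds between fetches. The point is that a site blocking the bot by name only works because the bot identifies itself honestly.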
-
Well, I think bot blocking is an obvious problem even now, and it will become more important as private networks grow.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best solution nor a welcome one, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
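The "out of sync" check mentioned above is commonly done with a reverse DNS lookup followed by a forward confirmation. Below is a rough sketch of that idea. The function names are hypothetical, the hostname suffixes are an assumption based on Google's published guidance and should be checked against current documentation, and the resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket
from typing import Callable, List

# Assumed suffixes for genuine Googlebot hostnames; verify before relying on them.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def verify_googlebot_ip(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True only if the IP reverse-resolves to a Google hostname
    and that hostname resolves back to the same IP."""
    try:
        host = reverse(ip).lower()
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        return ip in forward(host)
    except OSError:
        # A failed lookup means the claim cannot be confirmed.
        return False


# A request claiming to be Googlebot from an unrelated host fails the check.
print(verify_googlebot_ip(
    "203.0.113.7",
    reverse=lambda ip: "host.example.net",
    forward=lambda host: ["203.0.113.7"],
))
```

Any site owner can run a check like this against its access logs, which is why a backlink crawler sending Googlebot's user agent string from its own IP range would be easy to expose.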