How to Safely Scrape Google Results?
-
I've built a couple of small tools that I use personally, maybe 2 or 3 times per day.
Both tools scrape the top 10 results from Google and provide more details about each domain (like the SEOMoz Keyword Difficulty Tool).
Google seems to have banned my IP address for automated searches... can anyone tell me a safe way of scraping the Google results? Is there a suitable API for this?
How does SEOmoz do this on such a huge scale?
-
I doubt that the APIs have improved considerably since this blog post: http://www.seomoz.org/blog/the-nasty-problem-with-scraping-results-from-the-engines. Scraping Google is still a big issue, and it is still necessary for our daily SEO work.
Scraping safely can only work if you succeed in convincing Google that you're a "natural" user and not a scraping robot. How can you do that?
- search with alternating IPs from different locations, using proxies in the countries you'd like to scrape from
- don't send too many requests at once from the same source
Consider that, when requesting a URL, the browser sends various pieces of information to the server, such as your operating system, browser version, referer, etc. Every one of these can and should be varied, so that each new search appears to come from a different identity.
- change browsers, browser versions, operating system information, etc.
- take care when changing browser localization values (en-GB and en-US probably don't return the same results)
- have a good network of proxy servers ready, so you can send the different requests through them with your different identities
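To make the points above a bit more concrete, here is a minimal Python sketch using only the standard library. It is only an illustration of the idea, not a tested scraper: the proxy addresses, the User-Agent strings, the delay range and all function names are made-up assumptions that you would replace with your own. It rotates through proxies, picks a different browser identity per request, keeps the localization header fixed so results stay comparable, and waits a random time between requests.

```python
import itertools
import random
import time
import urllib.request

# Hypothetical example values; substitute your own proxies and browser strings.
PROXIES = [
    "http://proxy-uk.example.com:8080",
    "http://proxy-us.example.com:8080",
]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/16.5 Safari/605.1.15",
]
# Kept fixed on purpose: changing the locale probably changes the results.
ACCEPT_LANGUAGE = "en-GB,en;q=0.9"


def identities(proxies, user_agents, rng):
    """Yield an endless stream of (proxy, headers) pairs."""
    proxy_cycle = itertools.cycle(proxies)
    while True:
        headers = {
            "User-Agent": rng.choice(user_agents),
            "Accept-Language": ACCEPT_LANGUAGE,
        }
        yield next(proxy_cycle), headers


def build_request(url, proxy, headers):
    """Return an opener routed through `proxy` and a request carrying `headers`."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    request = urllib.request.Request(url, headers=headers)
    return opener, request


def random_delay(rng, low=20.0, high=60.0):
    """Seconds to wait before the next request (the range is an assumption)."""
    return rng.uniform(low, high)


def fetch_all(urls, rng=None):
    """Fetch each URL with a fresh identity, pausing between requests."""
    rng = rng or random.Random()
    stream = identities(PROXIES, USER_AGENTS, rng)
    pages = []
    for url in urls:
        proxy, headers = next(stream)
        opener, request = build_request(url, proxy, headers)
        pages.append(opener.open(request, timeout=30).read())
        time.sleep(random_delay(rng))
    return pages
```

Calling `fetch_all([...])` with a handful of search URLs would send each one through the next proxy in the list. For a tool used 2 or 3 times per day, generous delays are probably the most important part.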