Tons of Crappy links in new OSE (Open Site Explorer)
-
I am starting to miss the old OSE. I've found that for a lot of the pages on our site, the new OSE is showing WAY more links and most of them are garbage nonsense links from China, Russia, and the rest of the internet Wild West.
For instance, in the old OSE, this page used to show 9 linking domains:
http://www.uncommongoods.com/gifts/by-recipient/gifts-for-him
It now shows 454 links. Some of the new links (about 5 of them) are legitimate. The other 400+ are garbage. Some are porn sites; most of them don't even open a web page, they just initiate some shady download. I've seen this for other sites as well (like Urban Outfitters). This is making it much harder for me to do backlink analysis because I have no clue how many "normal" links they have. Is anyone else having this problem? Is there any way to filter all this crap out? See the attached screenshot of the list of links I'm getting from OSE.
-
Ok thank you. I will email directly.
-
Hey Zack,
Sorry to hear you're still having problems - we've seen an improvement on most sites at this point. Would you want to send me info on the site you're searching and any filters you are using?
If you don't feel comfortable posting that info on this thread, feel free to email me directly: [email protected].
Thanks!
Carin
-
Hey Carin,
I just wanted to follow up on this...I'm still seeing these spammy binary files show up as links. Unfortunately it makes OSE quite useless for me in regards to exploring our own backlinks.
What is the status of this problem? Has there been any headway? Why does our site have problems when most others don't?
Thanks!
-Zack
-
Hey Zack,
Thanks so much for understanding! We are doing everything we can to get the bug resolved. Binary files are the downloadable files you see as links - .pdf, .exe, .img, etc.
I'm really sorry, but we don't have a URL to the old OSE. I saw Steven's response as a workaround - is that possible or are there too many file types to filter out?
Our crawlers that provide the metrics to OSE are always crawling, but it will take about a month for our fix to propagate through to all the pages we crawl. Once we have removed these links from our crawlers, we'll then have to process the metrics. This is why it's looking like late September for the fix to show up.
I really appreciate your patience and understanding, we're doing everything we can to fix it!!
Thanks,
Carin
-
Hey Carin-
Thank you so much for this in-depth response. Glad to hear that you guys are aware of it and trying to sort it out. Very interesting info... I'd never heard of "binary" links before, but I hope you guys can figure out how to handle these. Seems like a tough task to tackle; just by looking at my CSV, it looks like these come in several different forms and could be hard to identify. I have a few questions:
1. Is there by chance a URL you could give me that points to the old OSE?
2. How often does OSE crawl? Is it a constant process or are there scheduled crawls?
Thanks!!
-Zack
-
Hey Zack, I saw the ticket you filed was answered by Aaron, but I just wanted to follow up with you as well. We have made some really exciting changes to the crawler, but, unfortunately, there is a pretty obvious bug as well...
The “questionable” links coming from the Internet Wild West are due to the crawler reaching much deeper into sites where there are more download (i.e. binary) links. The first issue is that the crawler is counting a binary file as a link, but the larger issue is that the crawler doesn’t really know how to handle these types of files. This bug is causing some links to be improperly associated with certain domains. This is probably what you're seeing with all the crazy links from China and Russia which don't actually link to the site you're researching.
There are two steps to addressing this issue: changing how the crawler sees these file types and then fixing how the crawler handles these file types. We have made improvements to our algorithm so that we will be able to handle the majority of these files correctly; however, this update will need about a month to propagate. The fix for this issue probably won’t be seen for two more updates, meaning late September. Our improvements should catch most of the issues, but there still could be a few cases we haven't addressed. If this happens, don't hesitate to let us know; we love feedback since it helps us improve and make our index even better!
The next step is to fix how our crawlers handle binary file links and prevent them from being improperly associated with certain domains. We are in the process of working through that issue right now. We’re doing everything we can to resolve this bug as we know it is alarming to see these “questionable” links associated with your sites.
I hope this helps and thanks so much for being patient :)
Thanks,
Carin
-
2 ways:
- Get as CSV and spend the time going through it
- Wait it out
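For the CSV route, a rough sketch of how the download-style links could be filtered out in Python. This is only an illustration under assumptions: the column name "URL" is hypothetical (check the header in your own export), and the extension list starts from the file types mentioned in this thread (.pdf, .exe, .img) plus a few guesses of my own. It will miss binary links that don't end in a recognizable extension.

```python
import csv
import io
from urllib.parse import urlparse

# Assumed extension list: .pdf, .exe, .img come from this thread,
# the rest are guesses. Extend it to match what your own CSV shows.
BINARY_EXTENSIONS = (".pdf", ".exe", ".img", ".zip", ".rar")


def is_binary_link(url):
    """Return True if the URL's path ends in one of the listed extensions."""
    path = urlparse(url).path.lower()
    return path.endswith(BINARY_EXTENSIONS)


def filter_links(rows, url_column="URL"):
    """Keep only rows whose linking URL does not look like a file download.

    `url_column` is a hypothetical column name, not necessarily the one
    used in the OSE export.
    """
    return [row for row in rows if not is_binary_link(row.get(url_column) or "")]


if __name__ == "__main__":
    # Made-up sample data standing in for an exported CSV.
    sample = io.StringIO(
        "URL,Title\n"
        "http://example.com/blog/gift-guide,Gift guide\n"
        "http://example.net/files/setup.exe,\n"
        "http://example.org/docs/catalog.PDF?dl=1,\n"
    )
    for row in filter_links(list(csv.DictReader(sample))):
        print(row["URL"])
```

To run it on a real export, replace the sample with `open("your-export.csv", newline="")` and pass the actual URL column name to `filter_links`.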
-
OK cool, good info, hope they fix it soon!! Any good ideas on how to filter this crap out?
-
Hello Zack,
That is an issue they are working on; I know this because I already discussed it with one of their help desk people. Here is the page that describes the changes: http://www.seomoz.org/blog/brand-new-open-site-explorer-is-here
In addition to that, here is some additional information I can share with you:
you may see “questionable” links with weird file extensions. This is due to the crawler reaching much deeper into sites where there are more download links. We are looking into fixing this bug as soon as we can so these won’t be counted as links.