[Very Urgent] More than 100 "/search/adult-site-keywords" Crawl errors under Search Console
-
I just opened Google Search Console and was shocked to see more than 150 Not Found errors under Crawl errors. The site runs on WordPress and is updated regularly.
Here's how they show up:
Example 1:
- URL: www.example.com/search/adult-site-keyword/page2.html/feed/rss2
- Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword/page2.html
Example 2 (this surprised me the most when I looked at the Linked From data):
- URL: www.example.com/search/adult-site-keyword-2.html/page/3/
- Linked From:
  - www.example.com/search/adult-site-keyword-2.html/page/2/ (this is showing as if it's from our own site)
  - http://a-spammy-adult-site.com/search/adult-site-keyword-2.html
Example 3:
- URL: www.example.com/search/adult-site-keyword-3.html
- Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword-3.html
How do I address this issue?
-
Here is what I would do:
- Disavow the domains that are linking to you from the adult site(s).
- The fact that Google Search Console also shows an internal page linking to these URLs makes me want to know: (a) have you always owned this domain, or could a previous owner have linked internally like this? Or (b) have you been hacked, or are you still?
In the case of (b), this can be really tricky. I once had a site where a crawl showed sitewide links to various external sites that we should not have been linking to. When I looked at the internal pages in my browser, I couldn't see any such link, even though it showed up in the crawler report.
Here was the trick: the hacker had set up a script that showed the link only when a bot was viewing the page. On top of that, we were running mirrored servers and only one of them had been hacked, so the links showed up only when you spidered that specific mirrored instance as a bot.
So thanks to the hack, not only were we showing bad links to bad sites, we were doing it through cloaking. Two strikes against us. Luckily we picked this up pretty quickly and fixed it immediately.
Use a spidering program or a browser tool that sends a Googlebot user agent and visit the pages that are reported as linking internally. You might be surprised.
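If you'd rather script that check than click through pages, here is a minimal sketch of the idea, assuming plain Python with only the standard library. The function names and the browser user-agent string are my own placeholders, not part of any tool mentioned above. It fetches the same page twice, once announcing itself as Googlebot and once as a normal browser, and reports links that only appear in the Googlebot version.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# User-agent strings are illustrative; the Googlebot one follows Google's published format.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def extract_links(html):
    """Return the set of link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def fetch(url, user_agent):
    """Download a page while sending the given User-Agent header."""
    req = Request(url, headers={"User-Agent": user_agent})
    with urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def bot_only_links(url):
    """Links served to a Googlebot user agent but not to a browser user agent."""
    as_bot = extract_links(fetch(url, GOOGLEBOT_UA))
    as_browser = extract_links(fetch(url, BROWSER_UA))
    return as_bot - as_browser


# Example (performs real HTTP requests, so run it against your own site only):
#   for link in sorted(bot_only_links("https://www.example.com/")):
#       print(link)
```

An empty result does not prove the site is clean. A cloaking script may key off the visitor's IP address rather than the user agent, and as in the mirrored-server case above, only some of your servers may be affected.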
Summary
Googlebot has a very long memory. This may be an old issue that was fixed long ago. If so, just return 404s for the pages that do not exist, disavow the bad domains, and move on. Also make sure that you have not been hacked, as that would explain these errors too.
Regardless, since Google did find these URLs at some point, you need to make sure the issue is resolved. Pull all the URLs into a spreadsheet and run Screaming Frog in list mode to check every one and confirm you have fixed them all.
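If you don't have Screaming Frog handy, a rough stand-in for the list-mode check could look like the sketch below. This is not how Screaming Frog works internally; it is just a small standard-library script, with function names I made up, that requests each URL and flags anything that does not answer with 404 or 410.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def check_status(url, opener=urlopen):
    """Return the HTTP status for a URL, or None if the request failed outright.

    urlopen follows redirects, so a redirected URL reports the final status.
    """
    req = Request(url, method="HEAD", headers={"User-Agent": "crawl-error-check/0.1"})
    try:
        with opener(req, timeout=10) as resp:
            return resp.status
    except HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; the code is what we want.
        return err.code
    except URLError:
        return None


def flag_unexpected(statuses):
    """Return the URLs whose status is anything other than 404 or 410."""
    return [url for url, status in statuses.items() if status not in (404, 410)]


# Example (performs real HTTP requests):
#   urls = ["http://www.example.com/search/adult-site-keyword-3.html"]
#   statuses = {url: check_status(url) for url in urls}
#   print(flag_unexpected(statuses))
```

Search Console lists the URLs without a scheme, so add `http://` or `https://` before checking them. Some servers answer HEAD requests differently from GET, so treat anything flagged here as a prompt to look at the URL by hand rather than as a verdict.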
-
-
Yep. Hoping someone can help with this.
-
Oh yeah, I missed that. That's very strange, not sure how to explain that one!
-
Thanks for the response, Logan. What you are saying definitely makes sense, but it makes me wonder why I see something like Example 2 under Crawl errors. Why does Google Search Console show two Linked From URLs, one from the spammy site and the other from my own website? How is that even possible?
-
I've seen similar situations, but never in bulk and not with adult sites. Basically, one or more domains are linking to your site with URLs that don't exist. When bots crawling those sites follow the links to yours, they hit a 404 page, which triggers the error in Search Console.
Unfortunately, there's not much you can do about this, as people (or automated spam programs) can create a link to any site at any time. You could disavow links from those sites, which might help from an SEO perspective, but it won't prevent the errors from showing up in your Crawl Errors report.
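If you do go the disavow route with this many errors, building the file by hand gets tedious. Below is a small sketch, assuming you have exported the Linked From URLs to a list. The helper names are hypothetical. It relies on the disavow file being plain text with one `domain:` entry per line and `#` for comments, and it skips your own domain so you never disavow yourself.

```python
from urllib.parse import urlsplit


def linking_domain(url):
    """Return the lowercased host of a URL, tolerating a missing scheme."""
    if "://" not in url:
        url = "//" + url  # lets urlsplit treat the leading part as the host
    return urlsplit(url).hostname or ""


def build_disavow(linking_urls, own_domain):
    """Build disavow file text with one domain: line per external linking host."""
    domains = set()
    for url in linking_urls:
        host = linking_domain(url)
        if host and host != own_domain:
            domains.add(host)
    lines = ["# Spam domains taken from the Linked From column of Crawl errors"]
    lines += [f"domain:{d}" for d in sorted(domains)]
    return "\n".join(lines) + "\n"


# Example using the Linked From values quoted in the question:
#   text = build_disavow(
#       [
#           "http://an-adult-image-hosting.com/search/adult-site-keyword/page2.html",
#           "www.example.com/search/adult-site-keyword-2.html/page/2/",
#           "http://a-spammy-adult-site.com/search/adult-site-keyword-2.html",
#       ],
#       own_domain="www.example.com",
#   )
#   with open("disavow.txt", "w", encoding="utf-8") as fh:
#       fh.write(text)
```

Review the output before uploading it, since a disavowed domain stays ignored until you remove it from the file.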