Old URLs that 301 to 404s are not being de-indexed.
-
We have a scenario on a domain that recently moved to enforcing SSL. If a page is requested over plain HTTP, the server automatically redirects to the HTTPS URL using a good old-fashioned 301. This is great except for any page that no longer exists, in which case you get a 301 going to a 404.
Here's what I mean.
Case 1 - Good page:
http://domain.com/goodpage -> 301 -> https://domain.com/goodpage -> 200
Case 2 - Bad page that no longer exists:
http://domain.com/badpage -> 301 -> https://domain.com/badpage -> 404
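To make the two cases concrete, here is a minimal Python sketch that walks a redirect chain hop by hop and labels the result. The names `trace_chain`, `classify`, and `sample_fetch` are hypothetical, and the lookup table simply replays the two cases above instead of hitting a real server, so treat it as an illustration rather than a finished audit tool.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin

REDIRECTS = {301, 302, 303, 307, 308}

# A fetcher takes a URL and returns (status code, Location header or None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]


def trace_chain(url: str, fetch: Fetcher, max_hops: int = 5) -> List[Tuple[str, int]]:
    """Follow redirects one hop at a time, recording (url, status) per hop."""
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECTS or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return chain


def classify(chain: List[Tuple[str, int]]) -> str:
    """Label a chain by its final status and whether a redirect preceded it."""
    final_status = chain[-1][1]
    if final_status == 200:
        return "good"
    if final_status in (404, 410) and len(chain) > 1:
        return "redirects to a missing page"
    return "other"


# Hypothetical responses mirroring Case 1 and Case 2 (no network access).
SAMPLE = {
    "http://domain.com/goodpage": (301, "https://domain.com/goodpage"),
    "https://domain.com/goodpage": (200, None),
    "http://domain.com/badpage": (301, "https://domain.com/badpage"),
    "https://domain.com/badpage": (404, None),
}


def sample_fetch(url: str) -> Tuple[int, Optional[str]]:
    return SAMPLE[url]


for start in ("http://domain.com/goodpage", "http://domain.com/badpage"):
    print(start, "->", classify(trace_chain(start, sample_fetch)))
# http://domain.com/goodpage -> good
# http://domain.com/badpage -> redirects to a missing page
```

Against a live site, `sample_fetch` would be swapped for a function that issues a request without following redirects and returns the status plus the `Location` header.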
Google is correctly re-indexing all the "good" pages and just displaying search results going directly to the https version.
Google is stubbornly hanging on to all the "bad" pages and serving up the original URL (http://domain.com/badpage) unless we submit a removal request. But there are hundreds of these pages and this is starting to suck. Note: the load balancer does the SSL enforcement, not the CMS. So we can't detect a 404 and serve it up first. The CMS does the 404'ing.
Any ideas on the best way to approach this problem? Or any idea why Google is holding on to all the old "bad" pages that no longer exist, given that we've clearly indicated with 301s that no one is home at the old address?
-
I don't think 404 vs 410 is the answer here. The basis for this thought is the following:
========
"If we see a page and we get a 404, we are gonna protect that page for 24 hours in the crawling system, so we sort of wait and we say maybe that was a transient 404, maybe it really wasn't intended to be a page not found."
"If we see a 410, then the site crawling system says, OK we assume the webmasters knows what they're doing because they went off the beaten path to deliberately say this page is gone," he said. "So they immediately convert that 410 to an error, rather than protecting it for 24 hours."
========
I'm thinking the deeper issue is why the 301s are not being respected. If a link points to http://domain.com/badpage and we use a 301 to point to https://domain.com/badpage - shouldn't the crawler (Google or otherwise) respect the 301? Why still index and serve up a page that responds with the 301? To me, this is baffling. If we serve up a 404 or a 410 - either way we are saying "this page is gone" but we're still seeing the original http://domain.com/badpage in the index?
Does that make sense? Or is there more clarification required?
-
sym_admin is right--you'll want to find the source of those pages, as Google apparently is seeing them from somewhere and still requesting them. If there are links to those pages somewhere, you will need to remove them. Also, if you're able, I would change those URLs so that they serve up a "410 Gone" error, and not a 404.
-
Read these three, then do what you got to do...
https://www.searchcommander.com/how-to-bulk-remove-urls-google/
https://productforums.google.com/forum/#!topic/webmasters/uYFJnsyiH8w
https://mza.seotoolninja.com/community/q/404-redirects-to-the-homepage-is-this-good-bad-ugly
For proper removal, please ensure that there are no INTERNAL links anywhere on your website to 404 addresses, from sitemap, buttons, text, or images (the whole 9 yards).
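For the sitemap part of that check, a rough Python sketch might look like the following. It only covers sitemap entries (not links in buttons, text, or images), the function names and the sample data are made up for illustration, and `status_of` stands in for whatever status lookup you actually use.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> List[str]:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text.strip())
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def dead_entries(xml_text: str, status_of: Callable[[str], int]) -> List[str]:
    """List sitemap URLs whose status is 404 or 410 and should be dropped."""
    return [url for url in sitemap_urls(xml_text) if status_of(url) in (404, 410)]


# Hypothetical sitemap and status table, for illustration only.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://domain.com/goodpage</loc></url>
  <url><loc>https://domain.com/badpage</loc></url>
</urlset>
"""

SAMPLE_STATUS = {
    "https://domain.com/goodpage": 200,
    "https://domain.com/badpage": 404,
}

print(dead_entries(SAMPLE_SITEMAP, SAMPLE_STATUS.get))
# ['https://domain.com/badpage']
```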
Good luck!