404'd pages still in index

mj775

I recently launched a site and shortly after performed a URL rewrite (not the greatest idea, I know). The developer 404'd the old pages instead of setting up permanent 301 redirects. This caused a mess in the index. I have tried to use Google's removal tool to remove these URLs from the index. The pages were being removed, but now I am finding them in the index as bare URLs for the 404'd pages (i.e. no title tag or meta description). Should I wait this out, or go back and 301 redirect the old URLs (which currently 404) to the new URLs? I am sure this is the reason for my lack of ranking, as the rest of my site is pretty well optimized and I have some quality links.

mj775

Will do. Thanks for the help.

JaspalX

I think the latter - the robots.txt change and the 301s.

But (if you can) leave a couple without a 301 and see what (if any) difference you get - would love to hear how it works out.

mj775

Is it better to remove the robots.txt entries that are specific to the old URLs so Google can see the 404s and remove those pages at its own pace, or to remove those entries and also 301 the old URLs to the new URLs? It seems those are my two options. Obviously, I want to do what is best for the site's rankings and gives the fastest turnaround. Thanks for your help on this, by the way!

JaspalX

I'm not saying remove the whole robots.txt file - just the bits relating to the old urls (if you have entries in a robots.txt that affect the old urls).

e.g. say your robots.txt blocks access to

mysite.com/old/url/format/

then you should remove that line from the robots.txt otherwise google won't be able to crawl those pages to 'see' the 404 and realise that they're not there.
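To check whether a robots.txt still blocks the old URL format, something like the rough sketch below may help. It uses Python's standard urllib.robotparser, whose matching rules are not guaranteed to be identical to Googlebot's, so treat it as a sanity check only. The robots.txt contents, the /admin/ rule, the http:// scheme and the is_blocked helper are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that still blocks the old URL format.
BLOCKING = """\
User-agent: *
Disallow: /admin/
Disallow: /old/url/format/
"""

# The same hypothetical file with only the old-URL line removed.
UNBLOCKED = """\
User-agent: *
Disallow: /admin/
"""

# Example old URL (the scheme and page name are assumptions).
OLD_URL = "http://mysite.com/old/url/format/page"

print(is_blocked(BLOCKING, OLD_URL))   # crawler cannot reach the 404
print(is_blocked(UNBLOCKED, OLD_URL))  # crawler can reach the 404
```

Once the second case applies to the live file, crawlers are at least allowed to request the old URLs and see whatever status they return.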

My guess is a few weeks before it all settles down, but that really is a finger in the air guess. I went through a similar scenario with moving urls and then moving them again shortly after the first move - took a month or two.

mj775

I am a little confused regarding removal of the robots.txt file, since blocking the URLs in robots.txt is a step in requesting removal from Google (per their removal tool requirements). My natural tendency is to 301 redirect the old URLs to the new ones. Will I need to remove the robots.txt file before permanently redirecting the old URLs to the new ones? How long does it take Google (estimate) to remove old URLs after a 301?

JaspalX

Ok, got that, so that sounds like an external rewrite - which is fine. A URL only, with no title or description - that sounds like what you get when you block crawling via robots.txt. If you've got that situation, I'd suggest removing the block so that Google can crawl the old URLs and find that they are 404s. Sounds like they'll fall out of the index eventually. Another thing you could try to hurry things along: 301 the old URLs to the new ones, submit a sitemap containing the old URLs (so that they get crawled and the 301s are picked up), then update your sitemap and resubmit it with only the new URLs.
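The two sitemap submissions could be generated with a small script along these lines. This is only a sketch using Python's standard xml.etree.ElementTree and the sitemaps.org namespace; the build_sitemap helper and the URL lists are placeholders, not anything taken from the actual site.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URL lists (assumed for illustration).
old_urls = ["http://www.example.com/index/home/keyword"]
new_urls = ["http://www.example.com/keyword"]

# Step 1: temporary sitemap of old URLs, so the 301s get crawled.
old_sitemap = build_sitemap(old_urls)

# Step 2: the permanent sitemap, listing only the new URLs.
new_sitemap = build_sitemap(new_urls)

print(old_sitemap)
print(new_sitemap)
```

Each string would be saved to a file and submitted in turn; how long to leave the old-URL sitemap in place before swapping is a judgment call.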

mj775

When I say URL rewrite, I mean we restructured the URLs to be cleaner and more search friendly. For example, take a URL that was www.example.com/index/home/keyword and restructure it to be www.example.com/keyword. Also, the old URLs (i.e. www.example.com/index/home/keyword) are being shown towards the end of the site:example.com search with just the old URL - no title or meta description. Is this a sign that they are on the way out of the index? Any insight would be helpful.
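For reference, the old-to-new mapping described above can be written down as a small function, which is handy for building a redirect list. This is a sketch only: it assumes every old URL starts with /index/home/ and that the rest of the path carries over unchanged, and the new_url_for name and the http:// scheme are illustrative.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed prefix shared by all old-style URLs.
OLD_PREFIX = "/index/home/"


def new_url_for(old_url: str):
    """Return the restructured URL for an old-style URL, or None if it is not old-style."""
    parts = urlsplit(old_url)
    if not parts.path.startswith(OLD_PREFIX):
        return None
    # Drop the old prefix but keep the leading slash and the rest of the path.
    new_path = "/" + parts.path[len(OLD_PREFIX):]
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))


print(new_url_for("http://www.example.com/index/home/keyword"))
print(new_url_for("http://www.example.com/keyword"))
```

Each old/new pair produced this way would become one 301 rule in whatever the server uses for redirects.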

JaspalX

Couple of things probably need clarifying: when you say URL rewrite, I'm assuming you mean an external rewrite (in effect, a redirect)? If you do an internal rewrite, that (of itself) should make no difference at all to how any external visitors/engines see your URLs/pages. If the old pages had links or traffic, I would be inclined to 301 them to the new pages. If the old pages didn't have traffic/links, leave them, they'll fall out eventually - they're not in an XML sitemap by any chance, are they? (If so, update the sitemap.) You often see a drop in rankings when restructuring a site and (in my experience) it can take a few weeks to recover. To give you an example, it took nearly two months for the non-www version of our site to disappear from the index after a similar move (and messing about with redirects).
