Why are old URL formats still being crawled by Rogerbot?
-
Hi,
In the early days of my blog, I used permalinks with the following format:
http://www.mysitesamp.com/2009/02/04/heidi-cortez-photo-shoot/
I then decided to change this format using .htaccess to this format:
http://www.mysitesamp.com//heidi-cortez-photo-shoot/
My question is: why does Rogerbot still crawl my old URL format, since these URLs no longer exist on my website or blog?
-
Thanks Alan,
That solved my problem...
-
-
Hi Alan,
After I disallowed the directory in robots.txt, Rogerbot still reports the non-existent URLs. Here is a sample URL that Rogerbot is reporting:
www.lugaluda.com/2009/08/05/chase-online-banking-chase-checking-bonus/
-
If you give me the URL, I can crawl it for you if you like.
-
Thanks Alan, I really appreciate your help. It gave me an idea: since all the old URLs are coming from a virtual 2009 directory, I added a disallow statement for that directory in robots.txt. Hopefully this will help solve the problem.
I will let you know the results after Rogerbot finishes recrawling my site...
Thanks Dude....
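For anyone trying the same thing, here is a rough sketch of how a disallow rule for the 2009 directory could be checked locally with Python's standard `urllib.robotparser`. The robots.txt content below is an assumption (the actual file is not shown in this thread), and the "new" URL is only a guess at what the date-free version would look like.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /2009/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample old-format URL from the thread (scheme added for illustration).
old_url = "http://www.lugaluda.com/2009/08/05/chase-online-banking-chase-checking-bonus/"
# Assumed new-format URL, derived by dropping the date segments.
new_url = "http://www.lugaluda.com/chase-online-banking-chase-checking-bonus/"

print(is_allowed(ROBOTS_TXT, "rogerbot", old_url))
print(is_allowed(ROBOTS_TXT, "rogerbot", new_url))
```

Keep in mind this only tells you what the rule blocks. It does not remove any links to the old URLs that may still be sitting in your pages.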
-
You need to search your site. Bots start on a page and follow the links; if they report those URLs, then they must have found them. Bots like Googlebot or Bingbot can find them on other sites, but Rogerbot only crawls within your site.
-
How will I know if they still exist on my site? If I try to access the specific URLs, they are no longer active.
-
The old format must still exist on your site somewhere; bots follow links from your home page through your site.
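One way to hunt for those leftover links is to scan your pages' HTML for hrefs that still use the date-based format. The sketch below is only an illustration using Python's standard library: the regular expression assumes the old permalinks look like /YYYY/MM/DD/slug/, and the sample HTML is made up from the URLs in the question. In practice you would feed it the saved source of your own pages.

```python
import re
from html.parser import HTMLParser

# Assumed shape of the old permalinks: /YYYY/MM/DD/slug/
OLD_PERMALINK = re.compile(r"/\d{4}/\d{2}/\d{2}/[^/]+")


class OldLinkFinder(HTMLParser):
    """Collects href values of <a> tags that match the old permalink format."""

    def __init__(self):
        super().__init__()
        self.old_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and OLD_PERMALINK.search(value):
                self.old_links.append(value)


def find_old_links(html):
    """Return all old-format link targets found in an HTML string."""
    finder = OldLinkFinder()
    finder.feed(html)
    return finder.old_links


# Made-up page fragment based on the URLs in the question.
sample_html = """
<a href="http://www.mysitesamp.com/2009/02/04/heidi-cortez-photo-shoot/">old</a>
<a href="http://www.mysitesamp.com/heidi-cortez-photo-shoot/">new</a>
"""

print(find_old_links(sample_html))
```

Any page where this turns up a match still links to the old format, and that is likely where the crawler is picking those URLs up.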