Crawlers crawl weird long urls
-
I did a crawl start for the first time and i get many errors, but the weird fact is that the crawler tracks duplicate long, not existing urls.
For example (to be clear):
there is a page: www.website.com/dogs/dog.html
but then it is continuing crawling:
www.website.com/dogs/dog.html
www.website.com/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dogs/dog.htmlwhat can I do about this? Screaming Frog gave me the same issue, so I know it's something with my website
-
Answer from Screaming Frog!
The reason the SEO spider is crawling these URLs, is due to incorrect relative linking on the site from the login URL.
It's actually when the spider crawls the login page, http://www.website.com/login?returnurl=%2F which then leads to this URL http://www.website.com/Home/ctl/SendPassword?returnurl=http:/www.website.com/ and then this /home/ sub directory URL http://www.website.com/Home/ctl/page/dogs.aspx which links to http://www.website.com/Home/ctl/page/page/dogs.aspx and so on and so forth. This is the path to the incorrect relative linking (attached for you).To stop this, you can correct the incorrect relative linking, or easier, simply exclude the login page.
-
Wow, Big mistakes are made one Home
maybe because of the .aspx. extension? alle pages have seo-friendly urls
Thanks Wesley and Paddy Displays
-
I see a link to http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/HeutinkICT.aspx from http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx.
It's the bottom left block which causes this link. This way you will get a big nesting effect.
-
OK found one problem
on this page
http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx
you have a link to
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/LesscherIT.aspx
which i think should be
-
ok I did a quick screaming fog and I think I have an idea, you just have to follow the breadcrumbs
You said in you example "In Links 9", you need to find out what those pages are and follow it back to the point of origin As I think its just one bad link that cause this nested link effect.
eg
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx
is being linked from
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/StationtoStation.aspx (as well as others)
You just have to follow that trail till you find the source of the problem
-
every link, except the hompage itself
-
I can't see any source:
The pages are like:
| URL | www.website.com/page/ |
| Status Code | 200 |
| Status | OK |
| Type | text/html; charset=utf-8 |
| Size | 55811 |
| Title | |
| Level | 10 |
| In Links | 9 |
| Out Links | 38 | -
Which URL(s) is/are causing problems?
-
please be free to check: http://tinyurl.com/lox7le9
-
You don't necessarily have to remove the link. As long as you can verify that it directs to the right page.
But curious to see what caused the problem
-
I think Screaming Frog will tell you the page it found the weird url, then you can check the source, and find out whats producing that link.
-
That is a good one! It's true that I have the same linking to the page itself. I will remove all that kind of links first and crawl again. I'll keep you in touch!
-
Are you somehow linking to www.website.com/dogs/dog.html from the page itself? There could be something wrong with that link.
I made a small mistake not so long ago with a redirection plugin. I told it to go to domain.com. This plugin was looking at the base + what i told it to. So it went to: domain.com/domain.com. Perhaps you made a similar mistake.Maybe you can send me the URL and i can take a look at it?
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What is the best way to treat URLs ending in /?s=
Hi community, I'm going through the list of crawl errors visible in my MOZ dashboard and there's a few URLs ending in /?s= How should I treat these URLs? Redirects? Thanks for any help
Moz Pro | | Easigrass0 -
Why do crawlers still track meta keywords if it is not needed in my site?
I have crawled three sites already and it returns more than 5000 errors most of which are MIssing Meta Keywords tags. The sites are on Wordpress and using my SEO plugin I can easily edit the meta keywords of each page, but I am having second thoughts. Well should I?
Moz Pro | | jernest0020 -
Is there a way you can determine the time SEOMoz crawls your website
I'm a new user to SEOMoz Pro, and having created a number of campaigns I wanted to know if you can set / schedule when SEOMoz crawls your website? Ideally I want to prevent SEOMoz from crawling my site at certain times, but I can't seem to find a way of doing this. Thanks in advance.
Moz Pro | | Simon_Glanville0 -
How to set the crawler or reports to ignore
I have a mobile version of a site with a URL string that disables the mobile view on smartphones (view full site), string is like this example.com/page-name.html?mobile=off I need the seomoz pro reports or crawler to ignore it because the crawler visits both versions of the site then reports them as duplicate content. Is there a setting page I haven't visited yet that will set this?
Moz Pro | | Str82u0 -
Crawl Stats Have Dissapeared
Hi SEOmoz I received an email today that another scan has been performed but when I log into my account all the tracking details have disappeared? States Pages crawled N/A. Can someone please help? Temporary problem? Website www.vintageheirloom.com Thanks
Moz Pro | | well-its-1-louder0 -
Title Element too Long - Counts ampersand as &
I was wondering if Google counts the ampersand written as '&' as 5 characters or a single one? It seems that Google counts it as a single character and SEOmoz PRO counts it as 5 characters.
Moz Pro | | DPSSeomonkey0 -
How long does the seomoz crawl take?
It's been doing it's thing for over 48 hours now and Ive got less than 350 pages... is this norma? It's NOT the first crawl.
Moz Pro | | borderbound0 -
Crawl Diagnostics Summary
Is there a way to view the charts in the crawl diagnostics summary on a monthly view (or export the monthly figures)?
Moz Pro | | RikkiD220