Where does the crawler find the URLs?
-
The SEOmoz crawler has found a number of 500 error pages, 404s, etc., which is very useful.
However, some of the URLs are in weird/broken formats that we don't recognise and nobody remembers ever using. They are not weird enough to imply hacking, but they suggest something is broken in the CMS.
Is there any way to find out where the crawler found these URLs? I can patch up and redirect the end result as best I can, but I would prefer to plug the leak.
Thanks
-
If you export the crawl diagnostics to a CSV, we do have this information in the last column.
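In case it helps, here is a minimal sketch of how you might group the exported rows by referring page, so you can see which pages are leaking the broken URLs. It assumes the problem URL is in the first column and the referrer is in the last column, as described above; the header names and sample rows are made up for illustration, so check them against your own export.

```python
import csv
import io
from collections import defaultdict


def group_by_referrer(csv_text, url_column=0):
    """Map each referring page to the list of problem URLs it links to.

    Assumption: the problem URL is in the first column and the
    referrer is in the last column of every row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    by_referrer = defaultdict(list)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        by_referrer[row[-1]].append(row[url_column])
    return dict(by_referrer)


# Made-up sample rows standing in for an exported file.
sample = (
    "URL,HTTP Status Code,Referrer\n"
    "http://example.com/prodcts/widget,404,http://example.com/shop/\n"
    "http://example.com/old-page,404,http://example.com/shop/\n"
    "http://example.com/broken,500,http://example.com/\n"
)

for referrer, urls in group_by_referrer(sample).items():
    print(referrer, "->", urls)
```

For a real export you would read the file contents and pass them in instead of `sample`. A referrer that shows up against many broken URLs is usually the template or CMS component worth looking at first.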
-
Thanks for the tips. It is a little frustrating that the information I need has passed through SEOmoz's system, but I guess they don't have the inclination or resources to show us the info.
Xenu reckons it can handle 1m URLs; we are in the position of not really knowing how many pages our site has!
-
You can pop the links into the free Xenu Link Sleuth* - after you've done a crawl just right-click on the URL you're interested in and click 'URL Properties' - you'll see any inlinks it finds listed there. Depending on the size of your site, it could take a while for the crawl to complete.
You could try the link: operator in Google first, though it won't be as thorough as Xenu.
*If you haven't seen it before, don't worry about how the Xenu website looks - the software is kosher - as recommended by many SEOmoz staff. Screaming Frog is a paid alternative (with a limited free version).
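If you are curious what an inlink report boils down to, here is a rough sketch of the idea. It is not how Xenu or Screaming Frog work internally, just the general technique: parse each page, resolve every link against the page it was found on, and remember who linked to what. The `pages` dictionary and its URLs are hypothetical stand-ins for pages you have already downloaded.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def build_inlinks(pages):
    """Return {target URL: set of page URLs that link to it}.

    `pages` maps a page URL to its HTML source (hypothetical input).
    """
    inlinks = defaultdict(set)
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links against the page they were found on.
            inlinks[urljoin(page_url, href)].add(page_url)
    return inlinks


pages = {
    "http://example.com/": '<a href="/shop/">Shop</a>',
    "http://example.com/shop/": '<a href="/prodcts/widget">Widget</a> <a href="/">Home</a>',
}
inlinks = build_inlinks(pages)
print(sorted(inlinks["http://example.com/prodcts/widget"]))
# prints ['http://example.com/shop/']
```

Looking up a broken URL in `inlinks` then tells you which page to fix.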
Related Questions
-
Difference between urls and referring urls?
Sorry, a bit new to this side of SEO. We recently discovered we have over 200 critical crawler issues on our site (mainly 4xx). We exported the CSV and it shows both a URL and a referring URL. Both lead to a 'page not found', so I have two questions: What is the difference between a URL and a referring URL? What is the best practice/how do we fix this issue? Is it one for our web developer? Appreciate the help.
Moz Pro | | ayrutd1 -
Youtube traffic page url referral
Hello, How can I see which YouTube videos that have my domain in their description URL drive traffic to my domain? I can see in GA how many visitors are coming from YouTube to my domain, but I can't see which YouTube video pages have driven the traffic. Any help?
Moz Pro | | xeonet320 -
Why is OSE showing no data for this URL?
Hi all, Does anyone have any ideas as to why OSE might not have any data for this URL: http://www.ccisolutions.com/StoreFront/product/shure-slx24-sm58-wireless-microphone-system-j3 It is not a new page at all. It's been on the site for years. Is OSE being quirky? Or is there an underlying problem with this page? Thanks in advance for any light you can shed on this, Dana
Moz Pro | | danatanseo0 -
URL, Subdomain and Root Domain Structure
Various URL Structure
mydomain.co.uk
www.mydomain.co.uk
http://www.mydomain.co.uk
http://mydomain.co.uk
mydomain.co.uk/index.html
www.mydomain.co.uk/index.html
http://www.mydomain.co.uk/index.html
http://mydomain.co.uk/index.html
HTACCESS File
Index Rewrite
RewriteRule ^index.(htm|html|php) http://www.mydomain.co.uk/ [R=301,L]
RewriteRule ^(.*)/index.(htm|html|php) http://www.mydomain.co.uk/$1/ [R=301,L]
RewriteCond %{HTTP_HOST} ^mydomain.co.uk
RewriteRule ^(.*)$ http://www.mydomain.co.uk/$1 [R=301,L]
Google WMT Setting: Configuration | Settings
Preferred domain: radio check on "don't set a preferred domain"
SEOMoz Open Site Explorer
mydomain.co.uk - (301 Redirect) [No Data] PA38 DA30
http://www.mydomain.co.uk/index.html - (301 Redirect) [No Data] PA23 DA30
Majestic Site Explorer
Number of Referring Domains & External Backlinks vary between the following instances:
URL: http://www.mydomain.co.uk
SUBDOMAIN: www.mydomain.co.uk
ROOT DOMAIN: mydomain.co.uk
Question
I have set up my htaccess file to rewrite "Various URL Structure" to www.mydomain.co.uk. However, when I view metrics in Majestic SEO, the URL / Subdomain / Root Domain all differ. Why is this happening?
Is this harming my site?
What is common practice when defining URL structure?
Any other quality advice and implementation structure would be much appreciated.
Regards Mark
Moz Pro | | Mark_Ch0 -
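One way to sanity-check rules like these before relying on them is to mimic them in a few lines of code and feed in each URL variant. The sketch below is only an approximation of the rewrite rules quoted in this question, not Apache itself: it assumes the capture groups are meant to be `(.*)`, it assumes paths arrive without a leading slash (as they do in .htaccess matching), and it ignores query strings. The function name `redirect_target` is made up for the example.

```python
import re

CANONICAL = "http://www.mydomain.co.uk/"


def redirect_target(host, path):
    """Return the 301 target for a request, or None if no rule fires.

    `path` has no leading slash, as in per-directory (.htaccess) matching.
    Patterns are copied as posted, so the dots are unescaped.
    """
    # Rule 1: an index file at the site root goes to the home page.
    if re.match(r"index.(htm|html|php)", path):
        return CANONICAL
    # Rule 2: an index file inside a directory goes to that directory.
    match = re.match(r"(.*)/index.(htm|html|php)", path)
    if match:
        return CANONICAL + match.group(1) + "/"
    # Rule 3: the bare host is sent to the www host, keeping the path.
    if re.match(r"mydomain.co.uk", host):
        return CANONICAL + path
    return None


for host, path in [
    ("mydomain.co.uk", ""),
    ("www.mydomain.co.uk", "index.html"),
    ("mydomain.co.uk", "shop/index.php"),
    ("www.mydomain.co.uk", ""),
]:
    print(host, repr(path), "->", redirect_target(host, path))
```

If every variant ends up at the same www URL (or returns `None` because it is already canonical), the redirects themselves are doing their job, and the differing numbers are more likely down to how each tool counts links at URL, subdomain and root domain level.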
SEOmoz crawler bug?
I just noticed that a few of my campaigns show the number of pages crawled as 1. Can someone tell me what this is? Out of 5 campaigns, 2 have only one page crawled, and one of those is an online shop with over 2000 products 🙂
Moz Pro | | mosaicpro0 -
Where do I find the old Linkscape report that shows Domain Juice passed?
It looks like they are retiring the old Linkscape. Where do we find the report that showed the Domain Juice passed by each link? Is it in the advanced reports and we need to request and wait?
Moz Pro | | MBayes0 -
Crawler reporting incorrect URLs, resulting in false errors...
The SEOmoz crawler is showing 236 Duplicate Page Titles. When I go in to see which page titles are duplicated, I see that the URLs in question are incorrect and read "/about/about/..." instead of just "/about/". The duplicates shown are the result of the crawler ending up on the "Page not found" page. Could it be the result of using relative links on the site? Anything I can do to remedy it? Thanks for your help! -Frank
Moz Pro | | Clements1 -
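On the relative-links theory: this is easy to check, because a relative link without a leading slash is resolved against the directory of the page it appears on. A quick illustration using Python's standard `urljoin` (the example.com URLs and the `team.html` file name are made up):

```python
from urllib.parse import urljoin

page = "http://example.com/about/"

# No leading slash: resolved relative to the current directory.
doubled = urljoin(page, "about/team.html")

# Leading slash: resolved from the site root.
fixed = urljoin(page, "/about/team.html")

print(doubled)  # http://example.com/about/about/team.html
print(fixed)    # http://example.com/about/team.html
```

So a link written as `about/...` on a page that already lives under `/about/` would produce exactly the doubled path described, and switching to root-relative or absolute links is one way to avoid it.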
Why is the crawler following form action links?
I have an issue with one of my sites where the SEOmoz crawler is following some form action links. It is my understanding that the crawler should ignore these links. Why would it not be ignoring them in certain cases? If you need more detail, please ask. Thanks.
Moz Pro | | AmberHanson0