Crawl Tool Producing Random URL's
-
For some reason SEOmoz's crawl tool is returning duplicate content URL's that don't exist on my website. It is returning pages like "mydomain.com/pages/pages/pages/pages/pages/pricing" Nothing like that exists as a URL on my website. Has anyone experienced something similar to this, know what's causing it, or know how I can fix it?
-
The same thing is happening for one of my campaigns, specifically for a 302 redirect to the homepage. My guess is I need to update it to a 301, but I'm not 100% sure if that would solve the issue?
-
Well we have our website setup to where if you type something after mydomain.com/ that is not a valid URL it will take you to the site map. For example if I typed in mydomain.com/ljlksdfsdfkjsdlfjsflj it would take me to the site map. The same holds true for mydomain.com/pages/pages/pages/pages/pages/pricing. So to answer your question, no it does not take you to the correct page, but it doesn't give you a 404 error either. It takes you to the site map.
-
I had this weird issue too but it was down to how our developers created the website. Does "mydomain.com/pages/pages/pages/pages/pages/pricing" still take you to the correct page or does it show you a 404 error?
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
The importance of url's - are they that important?
Hi Guys I'm reading some very contrasting and confusing reviews regarding urls and the impact they have on a sites ability to rank. My client has a number of flooring products, 71 to be exact - categorised under three sub categories 1. Gallery Wood - 2. Prefinshed Wood - 3. Parquet & Reclaimed. All of the 71 products are branded products (names that are completely unrelated to specific keyword search terms. This is having a major impact regarding how we optimise the site. FOR EXAMPLE: A product of the floor called "White Grain" - the "Key Word" we would like to rank this page for is Brown Engineered Flooring. I'm interested to know, should the name of the branded product match the url? What would you change to help this page rank better for the keyword - Brown Engineered Flooring. Title page: White Grain Url: thecompanyname.com/gallery-wood/white-grain (white grain is the name of the product) Key Word: Brown Engineered Flooring **Seo Title: **White Grain, Brown Engineered Flooring by X Meta Description: BLAH BLAH Brown Engineered Flooring BLAH BLAH Any feedback to help get my head around this would be really appreciated. Thank you.
Technical SEO | | GaryVictory0 -
What's the best Blogging platform
A year ago an SEO specialist evaluated my Wordpress site and said she had seen lower rankings for Wordpress sites--in general. We moved our site off any cms and design in html 5. Our blog, however, is still on Wordpress. I'm thinking about moving to the Ghost platform b/c I only a blog. The drawbacks are one author, no recent post lists, no meta tags. Is it worth it to move the site off Wordpress. Will it affect my rankings much if I have great content? Does anyone have experience with or opinions on Ghost?
Technical SEO | | RoxBrock0 -
Specific question about pagination prompted by Adam Audette's Presentation at RKG Summit
This question is prompted by something Adam Audette said in this excellent presentation: http://www.rimmkaufman.com/blog/top-5-seo-conundrums/08062012/ First, I will lay out the issues: 1. All of our paginated pages have the same URL. To view this in action, go here: http://www.ccisolutions.com/StoreFront/category/audio-technica , scroll down to the bottom of the page and click "Next" - look at the URL. The URL is: http://www.ccisolutions.com/StoreFront/IAFDispatcher, and for every page after it, the same URL. 2. All of the paginated pages with non-unique URLs have canonical tags referencing the first page of the paginated series. 3. http://www.ccisolutions.com/StoreFront/IAFDispatcher has been instructed to be neither crawled nor indexed by Google. Now, on to what Adam said in his presentation: At about minute 24 Adam begins talking about pagination. At about 27:48 in the video, he is discussing the first of three ways to properly deal with pagination issues. He says [I am somewhat paraphrasing]: "Pages 2-N should have self-referencing canonical tags - Pages 2-N should all have their own unique URLs, titles and meta descriptions...The key is, with this is you want deeper pages to get crawled and all the products on there to get crawled too. The problem that we see a lot is, say you have ten pages, each one using rel canonical pointing back to page 1, and when that happens, the products or items on those deep pages don't get get crawled...because the rel canonical tag is sort of like a 301 and basically says 'Okay, this page is actually that page.' All the items and products on this deeper page don't get the love." Before I get to my question, I'll just throw out there that we are planning to fix the pagination issue by opting for the "View All" method, which Adam suggests as the second of three options in this video, so that fix is coming. My question is this: It seems based on what Adam said (and our current abysmal state for pagination) that the products on our paginated pages aren't being crawled or indexed. However, our products are all indexed in Google. Is this because we are submitting a sitemap? Even so, are we missing out on internal linking (authority flow) and Google love because Googlebot is finding way more products in our sitemap that what it is seeing on the site? (or missing out in other ways?) We experience a lot of volatility in our rankings where we rank extremely well for a set of products for a long time, and then disappear. Then something else will rank well for a while, and disappear. I am wondering if this issue is a major contributing factor. Oh, and did I mention that our sort feature sorts the products and imposes that new order for all subsequent visitors? it works like this: If I go to that same Audio-Technica page, and sort the 125+ resulting products by price, they will sort by price...but not just for me, for anyone who subsequently visits that page...until someone else re-sorts it some other way. So if we merchandise the order to be XYZ, and a visitor comes and sorts it ZYX and then googlebot crawls, google would potentially see entirely different products on the first page of the series than the default order marketing intended to be presented there....sigh. Additional thoughts, comments, sympathy cards and flowers most welcome. 🙂 Thanks all!
Technical SEO | | danatanseo0 -
Webmaster Tools - Clarification of what the top directory is in a calender url
Hi all, I had an issue where it turned out a calender was used on my site historically (a couple of years ago) but the pages were still present, crawled and indexed by google to this day. I want to remove them now from the index as it really clouds my analysis and as I have been trying to clean things up e.g. by turning modules off, webmaster tools is throwing up more and more errors due to these pages. Below is an example of the url of one of the pages: http://www.example.co.uk/index.php?mact=Calendar,m1a033,default,1&m1a033year=2084&m1a033month=3&m1a033returnid=59&page=59?phpMyAdmin=xxyyzz The closest question I have found on the topic in Seomoz is: http://www.seomoz.org/q/duplicate-content-issue-6 I want to remove all these pages from the index by targeting their top level folder. From the historic question above would I be right in saying that it is: http://www.example.co.uk/index.php?mact=Calendar I want to be certain before I do a directory level removal request in case it actually targets index.php instead and deindexes my whole site (or homepage at the very least). Thanks
Technical SEO | | Mitty0 -
Blank pages in Google's webcache
Hello all, Is anybody experiencing blanck page's in Google's 'Cached' view? I'm seeing just the page background and none of the content for a couple of my pages but when I click 'View Text Only' all of teh content is there. Strange! I'd love to hear if anyone else is experiencing the same. Perhaps this is something to do with the roll out of Google's updates last week?! Thanks,
Technical SEO | | A_Q
Elias0 -
What's the website that analyzes all local business submissions?
I was recently looking at a blog post here or a webinar and it showed a website where you could see all of the local sites (yelp, Google places) where your business has been submitted. It was an automated tool. Does anyone remember the name of the site?
Technical SEO | | beeneeb0 -
What's the best way to transplant a blogger blog to another domain?
So I have this client who's got a killer blogger blog—tons of inbound links, great content, etc. He wants to move it onto his new website. Correct me if I'm wrong, but there isn't a single way to 301 the darn thing. I can do meta refresh and/or JavaScript redirects, but those won't transfer link juice, right? Is there a best practice here? I've considered truncating each post and adding a followed "continue reading…" link, which would of course link to the full post on the client's new site. It would take a while and I'm wondering if it would be worth it, and/or if there are any better ideas out there. Sock it to me.
Technical SEO | | TheEspresseo0 -
Directory URL structure last / in the url
Ok, So my site's urls works like this www.site.com/widgets/ If you go to www.site.com/widgets (without the last / ) you get a 404. My site did no used to require the last / to load the page but it has over the last year and my rankings have dropped on those pages... But Yahoo and BING still indexes all my pages without the last / and it some how still loads the page if you go to it from yahoo or bing, but it looks like this in the address bar once you arrive from bing or yahoo. http://www.site.com/404.asp?404;http://site.com:80/widgets/ How do I fix this? Should'nt all the engines see those pages the same way with the last / included? What is the best structure for SEO?
Technical SEO | | DavidS-2820610