Unexplained Crawl Diagnostic Errors & Opencart
-
Hi,
I've been looking at the crawl diagnostics for my site and trying to fix the errors that are showing up but Seomoz is producing some strange results.
It's saying pages are duplicated up to 16 times, but those pages don't exist. It's adding "page=3", "page=4" to the end of the product URL, but I don't see how it's finding those pages; nothing on the site (as far as I can tell) is linking to them. There is no "page=3", just the one product page.
Again on the duplicate content, under "other URLs" it's listing URLs like "http:///product-a", but again I don't see where it's finding these URLs, and obviously those URLs don't work. Those three slashes aren't a typo either.
So far I've reduced the number of errors from 2,005 to 543, but the rest of them I can't make sense of.
Also, what does one do with two products, e.g. "product-a-white" and "product-a-black", to prevent Seomoz from seeing duplicates? Canonical links won't work because there's no parent item, just those two. Google Webmaster Tools doesn't seem to have a problem though.
Using Opencart 1.5, if it helps.
Cheers,
-
Ah, so it may well be opencart doing something funky then. It's carrying the page url over into the product listing by the looks of it. I'll have to look into that then, thanks for pointing that out!
Do you have any idea how it could be finding the "http://maggie" style links?
Cheers for the help,
-
OK, here is an example:
http://www.lustrelingerie.com/Gracya-Lingerie/safari-wild-bra-push-up?page=5
is linked from
http://www.lustrelingerie.com/Gracya-Lingerie?page=5
It seems that if "page=" is in the URL of the catalog page, it is carried over onto the product links on that page.
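
To make the pattern concrete, here is a minimal Python sketch (not Opencart code, which is PHP) of dropping the pagination parameter from a product URL so that every category page links to the same product address. The function name `strip_page_param` is hypothetical, and it assumes the parameter is always called `page`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_page_param(url: str, param: str = "page") -> str:
    """Return the URL with the given query parameter removed.

    Assumption: the pagination parameter is named "page", as in the
    URLs quoted in this thread. Other query parameters are kept.
    """
    parts = urlsplit(url)
    # Keep every query pair except the pagination one.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_page_param(
        "http://www.lustrelingerie.com/Gracya-Lingerie/safari-wild-bra-push-up?page=5"
    ))
```

The same idea would apply inside the template that builds the product links: build them from the product's own URL rather than from the current request's query string.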
-
Hi Alan, thanks for the response.
Yea, sure there's additional pages for the categories, I'm talking about the individual products.
Take http://www.lustrelingerie.com/Bassaya-Lingerie/camila-red for example. Seomoz's Diagnostics is saying there's a http://www.lustrelingerie.com/Bassaya-Lingerie/camila-red?page=2. The latter works if you go there, I don't understand that and that's likely down to opencart, but what I don't get is how Seomoz is finding the link to it.
And it's the same with links such as "http://maggie" (real error), I don't see where Seomoz is finding the links to those. I've checked any stray canonical links but they seem fine to me.
Thanks,
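
One way to hunt for the source of links like "http:///product-a" or "http://maggie" is to scan a page's HTML for absolute links whose host is empty or has no dot in it. The following is a rough Python sketch under that assumption; the names `HrefCollector` and `suspicious_hrefs` are made up for illustration, and the heuristic would also flag legitimate dot-less hosts such as `localhost`.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HrefCollector(HTMLParser):
    """Collect href values from <a> and <link> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "link"):
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def suspicious_hrefs(html: str) -> list:
    """Return absolute http(s) links whose host is empty or has no dot."""
    collector = HrefCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        if parts.scheme in ("http", "https"):
            host = parts.hostname or ""  # hostname is None for "http:///x"
            if "." not in host:
                flagged.append(href)
    return flagged


if __name__ == "__main__":
    sample = (
        '<a href="http:///product-a">A</a>'
        '<a href="http://maggie">B</a>'
        '<a href="http://www.lustrelingerie.com/Bassaya-Lingerie/camila-red">C</a>'
    )
    for href in suspicious_hrefs(sample):
        print(href)
```

Running this over the saved source of the pages listed as referrers in the crawl report should show whether the broken links are in the markup itself.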
-
Yes they do exist
this page http://www.lustrelingerie.com/Everyday-Luxury-Underwear-Lingerie?page=1
is linked from this page
http://www.lustrelingerie.com/Everyday-Luxury-Underwear-Lingerie
There are many examples
-
The URL is http://www.lustrelingerie.com/
-
If you can give us a URL, I will tell you for sure.
-
Hi Ben, thanks for the response.
The thing is, I don't think it's a CMS issue; it seems to me that Seomoz is getting confused somewhere. My product pages are along the lines of "www.domain.com/range/product-a/". They have a canonical link pointing to "www.domain.com/product-a/", and each has only a single page. Which is why I can't figure out where Seomoz is picking up these duplicates.
With regards to your latter paragraph, yea I was thinking that. I thought it might confuse customers though, or I was hoping there would be a more elegant solution. Going back in and editing 500+ products isn't something I was looking forward to hehe.
Cheers,
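
For checking what canonical link a product page actually serves, a small Python sketch along these lines may help. It only parses HTML that has already been fetched or saved; `CanonicalFinder` and `find_canonicals` are hypothetical names, and the sample markup is an assumption based on the URLs described above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html: str) -> list:
    """Return all canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


if __name__ == "__main__":
    # Assumed markup for a page at www.domain.com/range/product-a/
    sample = '<head><link rel="canonical" href="http://www.domain.com/product-a/" /></head>'
    print(find_canonicals(sample))
```

If the "?page=2" version of a product page returns the same canonical URL as the plain version, the canonical tag is doing its job; if the list is empty or contains the paginated URL, that points at the template.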
-
I'll speak to the duplicates issue, since the other appears to be a CMS issue with how it is displaying the products. Whenever I see "page=1" in the URL, I can usually find a pagination script that isn't helping my SEO efforts. But I don't know for sure in your situation, especially since you said you don't see any links on the product page.
As for the "duplicates" issue, try to make the pages as distinct as possible. With our product pages (starting with the most sold items) I have begun changing up the product name. Many of our products differ only in height, so I'm having to get a little creative and add some other aspect to the URL that stays within the product's title. I only want one page from my site competing for that exact-match product SERP anyway. It's not a good idea to have two pages on your site competing for the same SERP. In the past, when that happened, Google always seemed to treat those pages with less authority.