Duplicate Content/Missing Meta Description | Pages DO NOT EXIST!
-
Hello all,
For the last few months, Moz has been showing roughly 2,000 duplicate content errors for our site. I took care of the pages that actually were duplicate content using best practices (301 redirects, canonicalization, etc.). What remains after these fixes are errors for pages that we never created.
Our homepage is www.primepay.com. An example of pages that are being shown as duplicate content is http://primepay.com/blog/%5BLink%20to%20-%20http:/www.primepay.com/en/payrollservices/payroll/payroll/payroll/online-payroll with a referring page of http://primepay.com/blog/%5BLink%20to%20-%20http:/www.primepay.com/en/payrollservices/payroll/payroll/online-payroll. Some of these are even now showing up as 403 and 404 errors.
The only real page on our site within that URL strand is primepay.com/payroll or primepay.com/payroll/online-payroll. Therefore, I am not sure where Moz is getting these pages from.
Another issue we are having in relation to duplicate content is that Moz is showing old campaign URLs tacked on to our blog page, e.g. http://primepay.com/blog?title=&page=2&utm_source=blog&utm_medium=blogCTA&utm_campaign=IRSblogpost&qt-blog_tabs=1.
As of this morning, our duplicate content went from 2,000 to 18,000. I exported all of our crawl diagnostics data and looked to see what the referring pages were, and even they are not pages that we have created. When you click on these links, they take you to a random point in time from the homepage of our blog; some dating back to 2010.
I checked our crawl stats in both Google's and Bing's Webmaster Tools, and there are no duplicate content or 400-level errors being reported from their crawls. My team is truly at a loss trying to resolve this issue, and any help with this matter would be greatly appreciated.
-
Thanks Dirk. Very insightful tip about not using campaign tracking to check internal links. There was an old blog post that had anchor text with campaign tracking that was causing many SEO issues. As for the latter part, it is unknown why a string of gibberish can be placed after /blog/ and also for our locations page. Our team's web developer is looking further into this issue. If anyone has any more advice on the matter it would be greatly appreciated.
-
Hey there
Dirk pretty much hit upon the issue, which I'll reiterate with a visual. If you enter any gibberish /blog URL (like this: http://primepay.com/blog/jglkjglkjg) in the browser, it returns a 200 OK, but it should return a 404 --> http://screencast.com/t/cStpPB5zE
Otherwise pages that are really broken will look to crawlers like they are supposed to exist.
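A check like this can also be scripted. The following is only a minimal sketch using Python's standard library: the helper names (`random_probe_url`, `fetch_status`, `is_soft_404`) are made up for illustration, and the "any 2xx answer for a nonsense URL is a problem" rule is an assumption, not something Moz or the crawlers define.

```python
import urllib.error
import urllib.request
import uuid


def random_probe_url(base_url: str) -> str:
    """Build a URL under base_url that almost certainly does not exist."""
    return f"{base_url.rstrip('/')}/probe-{uuid.uuid4().hex}"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for url (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return error.code


def is_soft_404(status: int) -> bool:
    """Assumption: a 2xx answer for a nonsense URL means a soft 404."""
    return 200 <= status < 300
```

For example, `is_soft_404(fetch_status(random_probe_url("http://primepay.com/blog")))` would come back `True` as long as the blog keeps answering gibberish URLs with a 200. Because redirects are followed, a gibberish URL that redirects to a working page is flagged as well.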
-
You shouldn't use campaign tracking to check internal links - you have to use event tracking. Check http://cutroni.com/blog/2010/03/30/tracking-internal-campaigns-with-google-analytics/ . Apart from the reporting issue, it's also generating a huge number of URLs that need to be crawled by Googlebot, which just wastes its time (most of these tagged URLs have a correct canonical version). You mention these tags are old, but they are still present on a lot of pages.
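To see how many distinct pages are really behind the tagged URLs in a crawl export, you can strip the campaign parameters and group by what is left. This is a rough sketch with Python's standard library; the function name `strip_campaign_params` is hypothetical, and treating every `utm_`-prefixed parameter as tracking-only is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_campaign_params(url: str) -> str:
    """Return url without any query parameters whose name starts with utm_."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.lower().startswith("utm_")
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


tagged = (
    "http://primepay.com/blog?title=&page=2&utm_source=blog"
    "&utm_medium=blogCTA&utm_campaign=IRSblogpost&qt-blog_tabs=1"
)
print(strip_campaign_params(tagged))
# http://primepay.com/blog?title=&page=2&qt-blog_tabs=1
```

Running every URL of the export through this and counting the unique results gives a quick idea of how much of the reported duplicate content is only tracking noise.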
For cases like this it's better to check with a local tool like Screaming Frog, which gives you a much better view of which pages are generating these links.
The other issue you have is probably related to a few pages that have a badly formatted (relative) URL in a link. The way your site is configured, it just renders a page for that URL instead of returning an error, so the bots crawl your site over and over again, each time encountering the same bad relative link and each time appending the bad path to the URL. It's an endless loop. The best way to avoid this is to use absolute internal links rather than relative links. I'm not sure if it's the only one, but one of the pages with this error is http://primepay.com/blog/7-ways-find-right-payroll-service-your-company - it contains a link to
[Your payroll service is no different.]([Link to - http://www.primepay.com/en/payrollservices/] "Your payroll service is no different.")
This page should generate a 404 but is generating a 200 and the loop starts here.
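The loop is easy to reproduce on paper. The sketch below uses Python's `urljoin` to resolve links the way a crawler would. It assumes a hypothetical site (example.com) where every URL returns a 200 page containing the same relative link `payroll/online-payroll`; the real link that triggers this on your site may be a different one.

```python
from urllib.parse import urljoin


def simulate_crawl_loop(start_url: str, relative_href: str, depth: int) -> list[str]:
    """Follow the same relative link `depth` times, as a crawler would
    if every resolved URL returned a 200 page containing that link."""
    urls = [start_url]
    for _ in range(depth):
        # The link is resolved against the URL of the page it was found on.
        urls.append(urljoin(urls[-1], relative_href))
    return urls


for url in simulate_crawl_loop(
    "http://example.com/en/payrollservices/", "payroll/online-payroll", 3
):
    print(url)
# http://example.com/en/payrollservices/
# http://example.com/en/payrollservices/payroll/online-payroll
# http://example.com/en/payrollservices/payroll/payroll/online-payroll
# http://example.com/en/payrollservices/payroll/payroll/payroll/online-payroll
```

Each pass adds another /payroll segment, which is the same pattern as the .../payroll/payroll/payroll/online-payroll URLs in your report. With a proper 404 on the first bad URL, the crawler would stop after one step.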
Again - with Screaming Frog you can generate a crawl path report for each of these bad URLs, which shows you exactly on which page the error is generated.
Hope this helps,
Dirk
-
Example:
http://primepay.com/blog/hgehergreg
Status:
My site as an example:
https://caseo.ca/blog/hgehergreg
If I put in random gibberish in this URL, it should be displaying a 404 page and not the blog page.
-
Getting you some help for direct advice on your problem, but wanted to leave a comment about the tool itself. When you are looking at the Moz crawl tool, it only updates once a week, so if there hasn't been that long between the last crawl and when you did the work, it won't be updated. Here's more info.