How do I fix apparent duplicates?
-
I'm auditing a site and would appreciate help with possible explanations and solutions for why the Content Drilldown report in Google Analytics is showing what appear to be duplicate pages (see image).
I'm wondering whether I have really got my head around the rel=canonical tag, because the page I'd consider a duplicate, "page/", has a canonical tag pointing to "~/page.html".
This is the tag from the page Locations/:
<link rel="canonical" href="http://www.domain.com/Locations.html" />
So I am unsure why both versions of the page are generating views. Shouldn't the canonical tag work like a 301 redirect?
I'm also unsure how the pages using the path page/ are generating so many views, because I have not been able to find them and they are not indexed by Google.
Unfortunately, the site is built on a proprietary CMS I'm not familiar with.
-
Hi Paul
I appreciate your explanation of when to use Canonical tags. I had previously thought they were limited to redirecting www.domain.com to domain.com.
I understand your solution to the dupes problem and will be searching SEOMoz's resources for how to write rewrite rules and, for that matter, Search and Replace filters using regex in Analytics.
It's not the first time you've provided a high-quality answer to a question of mine. I very much appreciate your contribution to my growing knowledge and to the SEOMoz community.
Best
Nic
-
A canonical tag is fundamentally different from a 301-redirect, Nic. There's nothing about a canonical tag that stops a visitor from being able to visit that URL. A 301-redirect actually forwards the visitor to the target page as if the initial page doesn't even exist so there's no physical way for a visitor to land on it.
Put another way, the source page of a 301-redirected URL doesn't even exist as far as the search engines are concerned (and eventually they'll actually drop the original URL altogether).
The canonical tag serves a very specific purpose. When two pages must continue to be reachable by 2 different URLs but the page content is essentially identical (e.g. a product page sorted by size or colour), then a canonical tag suggests that the search engines should consolidate the ranking value in the primary URL. That's it.
In the case of the /contact+us.html and /contact+us/ pages, that page should be reachable at only one of the two URLs. There's no value to the user in the page being reachable at the second address. The correct way to deal with this is to use a rewrite rule to 301-redirect all the page/ versions of the site's pages to the page.html versions (assuming that's what you've decided should be the canonical).
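To make the mapping concrete, here is a rough Python sketch of the decision such a rewrite rule makes. It is not the actual server configuration, which depends on the web server and CMS in use. The function names and the assumption that every duplicate is a single-level path of the form /page/ are hypothetical.

```python
import re

# Assumption: duplicates look like "/page/" and the canonical is "/page.html".
# Only single-level paths are handled; the site root "/" is left alone.
TRAILING_SLASH_PAGE = re.compile(r"^/(?P<page>[^/.]+)/$")


def redirect_target(path):
    """Return the .html path to redirect to, or None if no redirect applies."""
    match = TRAILING_SLASH_PAGE.match(path)
    if match is None:
        return None
    return "/" + match.group("page") + ".html"


def respond(path):
    """Return (status, headers): a 301 for page/ URLs, a normal 200 otherwise."""
    target = redirect_target(path)
    if target is not None:
        # The visitor never lands on the page/ URL; they are forwarded.
        return 301, {"Location": target}
    return 200, {}


print(respond("/contact+us/"))
print(respond("/contact+us.html"))
```

With a rule like this in place, the page/ URL can no longer record a pageview of its own, which is what a canonical tag by itself cannot guarantee.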
The only time to use canonical tags instead of redirects in a case like this is if it is technically impossible to implement the rewrites (a shared server that doesn't allow access to the .htaccess file for example). But this is sub-optimal and would still leave you with the same Analytics dupe page problem you're currently running into.
So what to do about the dupes in Analytics, given the site wasn't configured with the rewrites? You can write a custom Search and Replace filter for the site's profile that uses regex to merge both versions of each page into a single line. You'll absolutely want to do this in a new profile created just for this purpose though, keeping the original unfiltered profile for reference and historical data.
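Before saving a filter like that, it can help to try the pattern on a handful of request URIs. The sketch below uses Python's `re.sub` to stand in for the search-and-replace step. The pattern, the replacement, and the sample URIs are assumptions based on the page/ versus page.html naming described in this thread, and the back-reference syntax Analytics expects in its replace field may differ from Python's `\1`, so check that before copying it over.

```python
import re

# Assumed search pattern: a single-level path ending in a slash, e.g. "/Locations/".
SEARCH = r"^/([^/.]+)/$"
# Assumed replacement: the same page name with ".html" appended (Python syntax).
REPLACE = r"/\1.html"


def merge_uri(request_uri):
    """Rewrite a page/ URI to its page.html form; leave other URIs unchanged."""
    return re.sub(SEARCH, REPLACE, request_uri)


# Hypothetical sample of request URIs as they might appear in the report.
for uri in ["/Locations/", "/Locations.html", "/contact+us/", "/"]:
    print(uri, "->", merge_uri(uri))
```

Any URI that prints unchanged is one the filter would leave alone, which is a quick way to confirm the home page and the existing .html pages are not affected.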
Note that this will only affect data collected from the date of creation of the new profile/filter. It's not retroactive. If you want to combine results for these pages for the existing data, you'll need to dump it to Excel and use a formula to combine the dupes.
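If a script is more convenient than a spreadsheet formula, the same combining step can be sketched in Python. This is only an illustration: the row format (path, pageviews), the function names, and the sample figures are made up, and the normalisation rule assumes the page/ to page.html naming discussed above.

```python
import re
from collections import Counter


def canonical_path(path):
    """Map "/page/" to "/page.html"; leave every other path as it is."""
    return re.sub(r"^/([^/.]+)/$", r"/\1.html", path)


def combine_pageviews(rows):
    """Sum pageviews per canonical path from (path, views) rows."""
    totals = Counter()
    for path, views in rows:
        totals[canonical_path(path)] += views
    return dict(totals)


# Made-up example rows standing in for an exported report.
rows = [("/Locations/", 120), ("/Locations.html", 80), ("/contact+us/", 15)]
print(combine_pageviews(rows))
```

Each pair of duplicate rows collapses into a single line keyed by the .html version, which mirrors what the filter does for newly collected data.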
Hope that all makes sense?
Paul