Rel=canonical + no index
-
We have been running an A/B test of our homepage, and although we placed a rel=canonical tag on the test page, it is still being indexed. In fact, at one point Google even showed it as a sitelink. We have this problem throughout our website. My question is:
What is the best practice for duplicate pages?
1. Put only a rel=canonical pointing to the "wanted original page"
2. Put a rel=canonical (pointing to the wanted original page) and a noindex on the duplicate version
Has anyone seen any detrimental effect from doing #2?
Thanks
-
Interesting - I've very rarely had issues with GWO (Google Website Optimizer), but if a new URL was created and someone linked to it, I can see where you might have a problem.
(1) None of these things are absolute, I'm afraid, but typically, yes - a rel=canonical to a different page should keep the first page out of the index.
(2) Usually, but it depends. The problem here may be that Google just isn't crawling the test variant very often, so they may not be processing the rel=canonical yet.
If it's just a couple of pages, I'd give it time - it's probably not an emergency situation. Again, you could just ask Google to remove them in Google Webmaster Tools (GWT). I think you're doing the right thing with the canonical tags, but in practice it can take Google time to process them the way you want.
-
To answer the second question:
We actually use Google's Website Optimizer to run our test -- the problem started when someone linked to the test page.
Not sure if these scenarios are different for Google, but I'm just trying to understand it:
1. If a page was never indexed before and you put a rel=canonical on it (pointing to a different page), then will the rel=canonical keep it out of the index?
2. If a page was already in the index and you put a rel=canonical on it, is that a strong enough signal for Google to go and remove it from the index?
Obviously both of these scenarios apply once the pages have been crawled.
-
I wouldn't mix those signals - it's nearly impossible to tell what's working if you do. If the canonical on the test page isn't working, there may be a couple of issues:
(1) It could just be taking time. Honestly, it's never as fast as you want it to be.
(2) It may be that the test versions got crawled originally, but now aren't being crawled (so the canonical isn't being processed). Check the cache date on the test page.
The big question is how they got crawled in the first place. It's often better to use some sort of cookie-based implementation so that Google never even sees the B version. That's how most of the A/B test implementations work (specifically to avoid this problem).
If it's just a couple of URLs and you can't shake them, you could request manual removal in GWT. That really depends on the scope and URL structure, though.
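To make the cookie-based idea concrete, here is a minimal Python sketch, not how GWO or any particular testing tool actually works. It assumes both variants are served from the same URL and that a request without a cookie (which is how a crawler that doesn't keep cookies would typically arrive) always gets the A version. The cookie name and both function names are made up for illustration.

```python
import random

# Hypothetical cookie name, chosen for this example only.
COOKIE_NAME = "ab_variant"


def variant_to_serve(cookies):
    """Pick the variant for the current request.

    B is served only when the request already carries a cookie that says "B".
    A request with no cookie always gets A, so the B version never needs
    its own URL.
    """
    return "B" if cookies.get(COOKIE_NAME) == "B" else "A"


def bucket_for_new_visitor(cookies, rng=random.random):
    """Return the bucket to store in a cookie for a first-time visitor.

    Returns None if the visitor is already bucketed. The bucket only takes
    effect on later requests, once the browser sends the cookie back.
    """
    if cookies.get(COOKIE_NAME) in ("A", "B"):
        return None
    return "B" if rng() < 0.5 else "A"


if __name__ == "__main__":
    print(variant_to_serve({}))                   # cookieless request
    print(variant_to_serve({COOKIE_NAME: "B"}))   # returning visitor in bucket B
```

The trade-off in this sketch is that a new visitor's first page view is always A, which slightly skews the split but keeps B from being exposed at a separate, linkable URL.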
-
Good point, I was thinking of robots.txt, where the page would not be read.
But I had not thought about that situation. I am not sure what search engines would do.
But still, just the canonical is needed.
-
A page that has a noindex on it still gets crawled, and therefore the rel=canonical directive is still "seen" by the bot -- so why wouldn't the rel=canonical pass the credit over?
-
Just the rel=canonical.
If you noindex the page, the rel=canonical cannot be indexed and cannot work.
Rel=canonical simply passes the credit for the content to the canonical page.
Noindex is like cutting off your hand because you have a splinter. Links pointing to a non-indexed page are pouring link juice into thin air.
You can use a meta noindex,follow so that some of the link juice is returned, but canonical is best for duplicate content.
Actually, getting rid of the duplicate content is best.
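Since the thread turns on which signals a duplicate page is actually sending, here is a rough Python sketch for checking a page's HTML for a rel=canonical link and a robots meta tag. It uses only the standard library's `html.parser`; the class and function names and the sample markup are invented for this example, and it does not look at HTTP headers such as `X-Robots-Tag`.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects the rel=canonical href and robots meta directives from HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. bare attributes), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href")
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.robots += [t.strip().lower() for t in content.split(",") if t.strip()]


def inspect_page(html):
    """Return the canonical URL (or None) and robots directives found in html."""
    parser = HeadSignals()
    parser.feed(html)
    return {
        "canonical": parser.canonical,
        "robots": parser.robots,
        "noindex": "noindex" in parser.robots,
    }


if __name__ == "__main__":
    # Made-up markup for a duplicate page that sends both signals (option 2).
    sample = """
    <html><head>
      <link rel="canonical" href="https://www.example.com/" />
      <meta name="robots" content="noindex, follow">
    </head><body>Test variant</body></html>
    """
    print(inspect_page(sample))
```

A page that reports both a canonical URL and `noindex` is the mixed-signal case (option 2) that several replies above advise against.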