Include Cross Domain Canonical URL's in Sitemap - Yes or No?
-
I have several sites that have cross domain canonical tags setup on similar pages. I am unsure if these pages that are canonicalized to a different domain should be included in the sitemap. My first thought is no, because I should only include pages in the sitemap that I want indexed.
On the other hand, if I include ALL pages on my site in the sitemap, once Google gets to a page that has a cross domain canonical tag, I'm assuming it will just note that and determine if the canonicalized page is the better version. I have yet to see any errors in GWT about this. I have seen errors where I included a 301 redirect in my sitemap file. I suspect its ok, but to me, it seems that Google would rather not find these URL's in a sitemap, have to crawl them time and time again to determine if they are the best page, even though I'm indicating that this page has a similar page that I'd rather have indexed.
-
I looked at the sitemap, and they are including the http://www.seomoz.org/blog/the-story-of-seomoz but not the canonical page - http://www.masternewmedia.org/entrepreneurship-the-full-story-of-seomoz-told-by-rand-fishkin/
So based on this example, the page on SEOMoz is still included in the sitemap, regardless if it has a canonical or not.
This seems to make sense, since canonical links are used only as a hint and not an absolute directive.
I also noticed that Google is choosing to index and rank both pages, on Page 1.
SEOMoz is ranking higher on my browser for "the full story of seomoz". A few things going on here.
-
Why is google choosing to rank SEOMoz higher than Mastermedia.org for this page? There's a canonical setup, but google is choosing not to follow it. (again its a hint not an absolute) this doesn't always work.
-
I would think Google would be able to filter out the duplicate content easy. In this example, they are clearly not. SEOMoz is ranking #4 and Masternewmedia.org is ranking #5 for query "the full story of seomoz"
-
-
Right - as far as I know, you're supposed to put end URLs into a sitemap, not urls which 301 redirect. Cross domain canonical is still kind of new, but I would treat them as a 301 redirect and not include them in a sitemap.
Now, if you're curious, SEO Moz did a whiteboard Friday where they talked about this same exact issue (cross domain canonical), and as an experiment, re-posted a blog article from another blogger on SEO Moz.
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday
http://www.seomoz.org/blog-sitemap.xml
http://www.seomoz.org/blog/the-story-of-seomoz
The blog is still included in the blog sitemap. I think it probably won't 'hurt' to keep those pages in the sitemap, since a lot of sitemaps automatically generated CMS tools won't have been updated to deal with this yet.
-
There is no BIG problem if you add the pages that contain cross domain canonical tag on them. Why?
The reason why I can say this is because Google is not only indexing the pages from sitemap.xml file, Google have their own crawler and they have the ability to crawl and index the website no matter if you do not have an xml sitemap.
Google is very good at (in my opinion) picking the instructions that are available on the page so if you add the page in the xml sitemap, the crawler will read the instructions on the page and will only index the page that contain original content.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL change - Sitemap update / redirect
Hi everyone Recently we performed a massive, hybrid site migration (CMS, URL, site structure change) without losing any traffic (yay!). Today I am finding out that our developers+copy writers decided to change Some URLs (pages are the same) without notifying anyone (I'm not going into details why). Anyhow, some URLs in site map changed, so old URLs don't exist anymore. Here is the example: OLD (in sitemap, indexed): https://www.domain.com/destinations/massachusetts/dennis-port NEW: https://www.domain.com/destinations/massachusetts/cape-cod Also, you should know that there is a number of redirects that happened in the past (whole site) Example : Last couple years redirections: HTTP to HTTPS non-www to www trailing slash to no trailing slash Most recent (a month ago ) Site Migration Redirects (URLs / site structure change) So I could add new URLs to the sitemap and resubmit in GSC. My dilemma is what to do with old URL? So we already have a ton of redirects and adding another one is not something I'm in favor of because of redirect loops and issues that can affect our SEO efforts. I would suggest to change the original, most recent 301 redirects and point to the new URL ( pre-migration 301 redirect to newly created URL). The goal is not to send mixed signals to SEs and not to lose visibility. Any advice? Please let me know if you need more clarification. Thank you
Intermediate & Advanced SEO | | bgvsiteadmin0 -
Why do people put xml sitemaps in subfolders? Why not just the root? What's the best solution?
Just read this: "The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/." here: http://www.sitemaps.org/protocol.html#location Yet surely it's better to put the sitemaps at the root so you have:
Intermediate & Advanced SEO | | McTaggart
(a) http://example.com/sitemap.xml
http://example.com/sitemap-chocolatecakes.xml
http://example.com/sitemap-spongecakes.xml
and so on... OR this kind of approach -
(b) http://example/com/sitemap.xml
http://example.com/sitemap/chocolatecakes.xml and
http://example.com/sitemap/spongecakes.xml I would tend towards (a) rather than (b) - which is the best option? Also, can I keep the structure the same for sitemaps that are subcategories of other sitemaps - for example - for a subcategory of http://example.com/sitemap-chocolatecakes.xml I might create http://example.com/sitemap-chocolatecakes-cherryicing.xml - or should I add a sub folder to turn it into http://example.com/sitemap-chocolatecakes/cherryicing.xml Look forward to reading your comments - Luke0 -
Multiple Domain Names Pointing to One URL
Hi! A company I work with has purchased several (70-something) domain names that are relevant to their business. According to their IT pro, they're currently using DNS to point those domains to our IP address, with a catch-all header on IIS for that IP address. Essentially, we have 70-something domain names that direct to the homepage. I noticed that some have been indexed by Google and are pulling in the meta of the homepage they're being directed to. Is this potentially an issue? If so, would 301 redirects fix this or are we okay with the status quo and the indexing is no big deal? Thanks in advance!
Intermediate & Advanced SEO | | 199580 -
Brackets vs Encoded URLs: The "Same" in Google's eyes, or dup content?
Hello, This is the first time I've asked a question here, but I would really appreciate the advice of the community - thank you, thank you! Scenario: Internal linking is pointing to two different versions of a URL, one with brackets [] and the other version with the brackets encoded as %5B%5D Version 1: http://www.site.com/test?hello**[]=all&howdy[]=all&ciao[]=all
Intermediate & Advanced SEO | | mirabile
Version 2: http://www.site.com/test?hello%5B%5D**=all&howdy**%5B%5D**=all&ciao**%5B%5D**=all Question: Will search engines view these as duplicate content? Technically there is a difference in characters, but it's only because one version encodes the brackets, and the other does not (See: http://www.w3schools.com/tags/ref_urlencode.asp) We are asking the developer to encode ALL URLs because this seems cleaner but they are telling us that Google will see zero difference. We aren't sure if this is true, since engines can get so _hung up on even one single difference in character. _ We don't want to unnecessarily fracture the internal link structure of the site, so again - any feedback is welcome, thank you. 🙂0 -
What's the news on sitwide nofollow links and anchor text penalties
Is it possible to be penalized for sitewide nofollow links because of anchor text penalties, even if you use branded anchor text?
Intermediate & Advanced SEO | | BobGW0 -
Google fluctuates its result on Chrome's private browsing
I have seen an interesting Google behaviour this morning. As usual, I would open Chrome's private browsing to see how a keyword is ranking. This was what I see... Typed in "sell my car", I see Auto Trader page on 3rd. (Ref:Sell My Car 1st result img) Googled something else, then re-Googled "sell my car" and saw that our page went to 2nd! I repeated the same process and saw that we went from 3rd to 2nd again. Has Google results gone mental? PaGXJ.png
Intermediate & Advanced SEO | | tmg.seo0 -
Our site has been penalized and it's proving to be very hard to get our rankings back...
So I have a question. We have used nearly every trick in the book to rank our site, including a ton of white hat stuff.... but then also a lot of black hat practices that resulted in us dropping in the rankings by about 30-40 positions. And getting back to where we were (top 10 for most keywords) is proving to be nearly impossible. We have a ton of great content coming off of the site and we actually offer a quality product. We follow most of the guidelines advocated here on SEOmoz. But the black hat stuff we did has really taken a toll. And it's gonna be pretty much impossible to go back in time and erase all of the Black Hat stuff we did. So what should we do? Should we design a completely new website with a new domain? What can be done to help?
Intermediate & Advanced SEO | | LilyRay0 -
New gTLD's, buy or wait and see?
Is the new gTLD scheme from ICANN worth the money? I manage a brand relatively well-known in our own market segment. Would I benefit from moving from .com and national TLDs for my international sites to my own brand TLD? Are there any obvious SEO pros and cons?
Intermediate & Advanced SEO | | KnutDSvendsen0