How does a sitemap affect the definition of canonical URLs?
-
We are having some difficulty generating a sitemap that includes our SEO-friendly URLs (the ones we want to set as canonical). I was wondering whether we could simply use the non-SEO-friendly, non-canonical URLs that the sitemap generator has been producing and then use 301 redirects to send them to the canonical URLs. Is there a reason why we should not do this? We don't want search engines to think that the sitemap URLs are more important than the pages to which they redirect.
How important is it that the sitemap URLs match the canonical URLs? We would like to find a solution outside of the generation of the sitemap itself as we are locked into using a vendor’s product in order to generate the sitemap.
Thanks!
-
Thank you for your responses.
We use Endeca. It has a sitemap generator, but for whatever reason it cannot produce URLs that match our new SEO-friendly vanity URLs. We have had no sitemap for months while we try to find a solution to this problem.
From what I'm gathering, going without a sitemap is the right approach for now, since uploading a "bad" sitemap would do more harm than good. Yes?
Also, there seems to be no way to get around this with a clever redirect scheme. Am I right about this as well?
In which case, it may boil down to choosing between an accurate sitemap and SEO'd URLs. Not sure which would be more important.
Website's here, if that's useful: www.pli.edu
-
Bing has said that anything over 1% of bad URLs in a sitemap constitutes a dirty sitemap to them, so yes, it is very important.
Are you able to share the system that you're using? Others may have experience in working around this already.
-
It's extremely important that the sitemap URLs match the canonical URLs that people arrive at. If they do not match, the search engine will consider the sitemap "dirty" and not valuable, because it does not accurately reflect the actual layout of the website.
Essentially, search engines treat any sitemap URL that does not return an HTTP 200 status as a bad URL. A URL that answers with a 301 redirect falls into that category, so a sitemap built from redirecting URLs risks being rejected. This is absolutely something that you should work to correct.
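If it helps, here is a rough Python sketch of how you might audit a generated sitemap before submitting it. It pulls every `<loc>` out of the sitemap XML, asks for each URL without following redirects, and reports the share that answer with anything other than 200. The function names (`sitemap_locs`, `http_status`, `audit`, `is_dirty`) are made up for this example, and the 1% cut-off is just the figure quoted earlier in this thread, not a documented API value.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, Tuple

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Assumption: the 1% figure mentioned in this thread, used as a rule of thumb.
DIRTY_THRESHOLD = 0.01


def sitemap_locs(xml_text: str) -> List[str]:
    """Return every <loc> URL found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.findall(".//sm:loc", SITEMAP_NS) if el.text]


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so a 301/302 is reported as such."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def http_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the raw status code (redirects not followed)."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (including unfollowed 3xx) arrive here.
        return err.code


def audit(
    locs: Iterable[str], status_of: Callable[[str], int]
) -> Tuple[List[Tuple[str, int]], float]:
    """Return (list of (url, status) that are not 200, fraction of bad URLs)."""
    urls = list(locs)
    bad = []
    for url in urls:
        status = status_of(url)
        if status != 200:
            bad.append((url, status))
    fraction = len(bad) / len(urls) if urls else 0.0
    return bad, fraction


def is_dirty(fraction: float, threshold: float = DIRTY_THRESHOLD) -> bool:
    """True when the share of bad URLs exceeds the chosen threshold."""
    return fraction > threshold
```

To run it against a real file you would call something like `audit(sitemap_locs(xml_text), http_status)`. Passing the status lookup in as a parameter also lets you try the logic with a plain dictionary of known statuses before hitting the live site.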