Sitemap Rules
-
Hello there,
I have some questions pertaining to sitemaps that I would appreciate some guidance on.
1. Can an XML sitemap contain URLs that are blocked by robots.txt? Logically, it makes sense to me not to include pages blocked by robots.txt, but I would like some clarity on the matter, i.e. will having pages blocked by robots.txt in a sitemap negatively impact the benefit of the sitemap?
2. Can an XML sitemap include URLs from multiple subdomains? For example:
http://www.example.com/www-sitemap.xml would include the home page URL of two other subdomains i.e. http://blog.example.com/ & http://blog2.example.com/
Thanks
-
Theoretically, a URL that is blocked by robots.txt should not appear in the index, regardless of whether it is in the sitemap. In practice, I have seen URLs indexed that were blocked by robots.txt but were listed in the sitemap and had good links pointing to them. If you want to block pages that have good links pointing to them, my advice is to remove them from the sitemap. #justathought
About URLs from multiple subdomains: I personally create a separate sitemap for each subdomain and link them to the main sitemap, and I see better indexing that way.
Again, these are my personal experiences and not rules, so please keep in mind that things can be different for you.
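If you want to check which sitemap URLs are blocked before you submit, here is a minimal sketch using only the Python standard library. The function name `blocked_sitemap_urls` and the sample robots.txt and sitemap contents are made up for illustration. It only applies the standard library's reading of the robots.txt rules and says nothing about how a particular search engine will treat those URLs.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(sitemap_xml, robots_txt, user_agent="*"):
    """Return the sitemap URLs that robots.txt disallows for user_agent.

    Hypothetical helper: both arguments are plain strings holding the
    file contents, so nothing is fetched over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Illustrative data only; these paths are assumptions, not from the thread.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/private/report.html</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in blocked_sitemap_urls(SAMPLE_SITEMAP, SAMPLE_ROBOTS):
        print("Blocked by robots.txt but listed in sitemap:", url)
```

Anything this returns is a candidate for removal from the sitemap, or for unblocking in robots.txt if you actually want it crawled.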
-
Hey,
1.) Yes, you can do this and it won't negatively impact the sitemap, but it might cause a couple of Search Console errors when you come to submit the URLs. Blocking crawlers in the robots.txt file is a directive that instructs them not to crawl that particular page. With this being said, supplying them with a sitemap of all page locations will not mean that they crawl these pages, but it does tell crawlers that these pages exist. Personally, I would meta noindex these pages to make sure that they don't reach search engines, as blocking in the robots.txt file is often not enough to prevent this, especially if you're also submitting a sitemap (bear in mind that crawlers can only see a noindex tag on pages they are allowed to crawl).
2.) In short, I don't think you can have a single XML sitemap containing URLs from multiple subdomains, BUT you can have individual sitemaps for multiple subdomains hosted on the main domain. Google have broken this down really well in their Webmaster Tools help article:
https://support.google.com/webmasters/answer/75712?hl=en&topic=8476&ctx=topic
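To illustrate the one-sitemap-per-subdomain approach, here is a rough sketch in Python (standard library only). The function name `sitemaps_by_host` is hypothetical, and the URLs are the ones from the question plus one made-up blog post. It only builds the XML strings; where you host and submit each file is up to you and is covered in the Google article above.

```python
from collections import defaultdict
from urllib.parse import urlsplit
from xml.sax.saxutils import escape


def sitemaps_by_host(urls):
    """Group URLs by hostname and build one sitemap XML string per host.

    Hypothetical helper: returns a dict mapping hostname to sitemap XML.
    """
    grouped = defaultdict(list)
    for url in urls:
        grouped[urlsplit(url).netloc].append(url)

    sitemaps = {}
    for host, host_urls in grouped.items():
        # escape() handles characters such as & that must be entity-encoded.
        entries = "\n".join(
            f"  <url><loc>{escape(u)}</loc></url>" for u in host_urls
        )
        sitemaps[host] = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}\n"
            "</urlset>\n"
        )
    return sitemaps


# Example input; the blog post URL is an assumption for illustration.
EXAMPLE_URLS = [
    "http://www.example.com/",
    "http://blog.example.com/",
    "http://blog.example.com/post-1",
    "http://blog2.example.com/",
]

if __name__ == "__main__":
    for host, xml in sitemaps_by_host(EXAMPLE_URLS).items():
        print(f"--- sitemap for {host} ---")
        print(xml)
```

Each resulting file lists URLs from a single host only, which matches the separate-sitemaps setup described in both answers.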
Hope this helps!
Sean