XML Sitemap Issue or not?
-
Hi Everyone,
I submitted a sitemap in Google Webmaster Tools and got a warning message listing 38 issues.
Issue: URL blocked by robots.txt.
Description: Sitemap contains URLs which are blocked by robots.txt.
Example: the ones that were given were URLs that we don't want indexed: Sitemap: www.example.org/author.xml
Value: http://www.example.org/author/admin/
My concern is that the number of indexed URLs is pretty low, and I know that robots.txt rules are a problem if they block URLs that need to be indexed. The blocked URLs shown in the report appear to be ones we don't want indexed, but the report doesn't display all of the blocked URLs.
Do you think I'm having a major problem, or is everything fine? What should I do? How can I fix it?
FYI: We use WordPress for our website.
Thanks
-
Hi Dan
Thanks for your answer. Would you really recommend using the plugin instead of just uploading the XML sitemap directly to the website's root directory? If yes, why?
Thanks
-
Lisa
I would honestly switch to the Yoast SEO plugin. It handles the SEO (and robots.txt) a lot better, as well as the XML sitemaps all within that one plugin.
I'd check out my guide for setting up WordPress for SEO on the Moz blog.
Most WP robots.txt files will look like this:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
And that's it.
You could always just try changing yours to the above setting first, before switching to Yoast SEO - I bet that would clear up the sitemap issues.
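To see which sitemap URLs a given set of robots.txt rules would block, a minimal Python sketch using the standard library's `urllib.robotparser` might look like this. The rules are the ones quoted above; the second URL and the function name `blocked_urls` are made up for illustration, and Google's own parser may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted above, one directive per line.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /wp-admin/",
    "Disallow: /wp-includes/",
]


def blocked_urls(robots_lines, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network request
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    sitemap_urls = [
        "http://www.example.org/author/admin/",
        "http://www.example.org/wp-admin/options.php",  # hypothetical URL
    ]
    for url in blocked_urls(ROBOTS_LINES, sitemap_urls):
        print("blocked:", url)
```

Running the same function against your current robots.txt lines and the URLs from your sitemap should show whether anything you actually want indexed is being blocked.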
Hope that helps!
-Dan
-
Lisa, try checking manually which URLs are not getting indexed in Google. Make sure you do not have any nofollow directives on those pages. If all the pages are linked together, then Google will crawl your whole site eventually; it is just a matter of time.
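If checking by hand gets tedious, a rough Python sketch like the one below could flag robots meta tags and `rel="nofollow"` links in a page's HTML. It uses only the standard library's `html.parser`; the class name `RobotsDirectiveParser` and the helper `audit_html` are hypothetical, and it assumes you already have the page source as a string.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Collect meta-robots directives and rel="nofollow" link targets."""

    def __init__(self):
        super().__init__()
        self.meta_directives = set()
        self.nofollow_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # content looks like "noindex, nofollow"; split it into directives
            content = attrs.get("content") or ""
            self.meta_directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )
        elif tag == "a" and "nofollow" in (attrs.get("rel") or "").lower().split():
            self.nofollow_links.append(attrs.get("href"))


def audit_html(html):
    """Return (set of meta-robots directives, list of nofollow hrefs)."""
    parser = RobotsDirectiveParser()
    parser.feed(html)
    return parser.meta_directives, parser.nofollow_links
```

A page whose directives include `noindex` will not be indexed no matter what the sitemap says, so those are the first ones to look at.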
-
Hi
When generating the sitemap, xml-sitemaps.com detects 46 URLs, but when I add the sitemap to WMT only 12 get submitted and 5 are indexed, which is really worrying me. This might be because of the XML sitemap plugin that I installed. Maybe something is wrong with my settings (docs 1 and 2 attached).
I am kind of lost, especially since SEOmoz hasn't detected any URLs blocked by robots.txt.
It would be great if you could tell me what I should do next.
Thanks
-
The first question I would ask is how big the difference is. If there is a large difference between the number of pages on your site and the number indexed by Google, then you have an issue. The blocked pages might be the ones linking to the pages that have not been indexed, which would cause issues. Try removing the nofollow on those pages, then resubmit your sitemap and see if that fixes the issue. Also double-check your sitemap to make sure you have correctly added all the pages to it.
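For the sitemap double-check, something along these lines could list the URLs a sitemap actually contains and report which expected pages are missing. It is only a sketch: it assumes a plain `<urlset>` sitemap in the standard sitemaps.org namespace (not a sitemap index), and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value listed under <url> in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def missing_from_sitemap(xml_text, expected_urls):
    """Return the expected URLs that do not appear in the sitemap."""
    listed = set(sitemap_urls(xml_text))
    return [url for url in expected_urls if url not in listed]
```

Comparing the length of `sitemap_urls(...)` with the number of pages a crawler finds gives a quick read on how big the gap really is.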