Negative impact on crawling after upload robots.txt file on HTTPS pages
-
I experienced negative impact on crawling after upload robots.txt file on HTTPS pages. You can find out both URLs as follow.
Robots.txt File for HTTP: http://www.vistastores.com/robots.txt
Robots.txt File for HTTPS: https://www.vistastores.com/robots.txt
I have disallowed all crawlers for HTTPS pages with following syntax.
User-agent: *
Disallow: /Does it matter for that? If I have done any thing wrong so give me more idea to fix this issue.
-
Hi CP,
If you wish to use robots.txt to block crawlers, then your two robots.txt files should be as follows:
For your http protocol (http://vistastores.com/robots.txt
User-agent: * Allow: /
For the https protocol (https://vistastores.com/robots.txt
User-agent: * Disallow: / Personally, I prefer to use the noindex meta tag for page blocking because it is a more reliable way of ensuring that the pages are not indexed. (Never try to use both at once) This link explains the difference between the two: [Google Webmaster Tools Help.](http://www.google.com/support/webmasters/bin/answer.py?answer=35302 "Robots blocking crawlers") Hope that helps, Sha ```You can use a robots.txt file to request that search engines remove your site and prevent robots from crawling it in the future. (It's important to note that if a robot discovers your site by other means - for example, by following a link to your URL from another site - your content may still appear in our index and our search results. To entirely prevent a page from being added to the Google index even if other sites link to it, use a [noindex meta tag](http://www.google.com/support/webmasters/bin/answer.py?answer=61050).)
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
From http to https
Hi Guys, I have mixed http and https content on my ecommerce store. My server people is telling me force all to https as it is better for ssl certificate. All versions of the site are declared on search console. Can forcing https and not having more mixed content impact my site badly ? Thanks.
Intermediate & Advanced SEO | | Kepass0 -
Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
my site uses is set up at http://www.site.com I have my site redirected from non- www to the www in htacess file. My question is... what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this? Or do you leave it blank? User-agent: * Disallow: / Sitemap: http://www.morganlindsayphotography.com/sitemap.xml Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
Intermediate & Advanced SEO | | morg454540 -
Duplicate Page Content Issues Reported in Moz Crawl Report
Hi all, We have a lot of 'Duplicate Page Content' issues being reported on the Moz Crawl Report and I am trying to 'get to the bottom' of why they are deemed as errors... This page; http://www.bolsovercruiseclub.com/about-us/job-opportunities/ has (admittedly) very little content and is duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/explorer-of-the-seas-2015/ This page is basically an image and has just a couple of lines of static content. Also duplicated with; http://www.bolsovercruiseclub.com/cruise-lines/costa-cruises/costa-voyager/ This page relates to a single cruise ship and again has minimal content... Also duplicated with; http://www.bolsovercruiseclub.com/faq/packing/ This is an FAQ page again with only a few lines of content... Also duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/exclusive-canada-&-alaska-cruisetour/ Another page that just features an image and NO content... Also duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/free-upgrades-on-cunard-2014-&-2015/?page_number=6 A cruise deals page that has a little bit of static content and a lot of dynamic content (which I suspect isn't crawled) So my question is, is the duplicate content issued caused by the fact that each page has 'thin' or no content? If that is the case then I assume the simple fix is to increase add \ increase the content? I realise that I may have answered my own question but my brain is 'pickled' at the moment and so I guess I am just seeking assurances! 🙂 Thanks Andy
Intermediate & Advanced SEO | | TomKing0 -
Show parts of page A on page B & C?
Good afternoon,
Intermediate & Advanced SEO | | rayvensoft
A quick question. I am working on a website which has a large page with different sections. Lets say: Page 1
SECTION A
SECTION B
SECTION C Now, they are adding a new area where they want to show only certain sections, so it would look like this: Page 2
SECTION A Page 3
SECTION C Page 4
SECTION D So my question is, would a rel='canonical' tag back to Page 1 be the correct way of preempting any duplicate content issues? I do not need Page 2-4 to even be indexed, it is just a matter of usability and giving the users what they are looking for without all the rest of the extra stuff. Gracias. Tesekürler. Salamat Ko. Thanks. (bonus thumbs up for anybody who knows which languages each of those are) 🙂0 -
How can you indexed pages or content on pages that are behind a pay wall or subscription login.
I have a client that has a boat of awesome content they provide to their client that's behind a pay wall ( ie: paid subscribers can only access ) Any suggestions mozzers? How do I get those pages index? Without completely giving away the contents in the front end.
Intermediate & Advanced SEO | | BizDetox0 -
Is there any delay between crawling a page by google and displaying of the ratings in rich snippet of the results in google?
Is there any delay between crawling a page by google and displaying of the ratings in rich snippet of the results in google?
Intermediate & Advanced SEO | | NEWCRAFT0 -
Our site is recieving traffic for both .com/page and .com/page/ with the trailing slash.
Our site is recieving traffic for both .com/page and .com/page/ with the trailing slash. Should we rewrite to just the trailing slash or without because of duplicates. The other question is, if we do a rewrite, google has indexed some pages with the slash and some without - i am assuming we will lose rank for one of them once we do the rewrite, correct?
Intermediate & Advanced SEO | | Profero0 -
Crawl questions
My first website crawl indicating many issues. I corrected the issues, requested another crawl and received the results. After viewing the excel file I have some questions. 1. There are many pages with missing Titles and Meta Descriptions in the Excel file. An example is http://www.terapvp.com/threads/help-us-decide-on-terapvp-com-logo.25/page-2 That page clearly has a meta description and title. It is a forum thread. My forum software does a solid job of always providing those tags. Why would my crawl report not show this information? This occurs on numerous pages. 2. I believe all my canonical URLs are properly set. My crawl report has 3k+ records, largely due to there being 10 records for many pages. These extra records are various sort orders and style differences for the same page i.e. ?direction=asc. My need for a crawl report is to provide actionable data so I can easily make SEO improvements to my site where necessary. These extra records don't provide any benefit. IF the crawl report determined there was not a clear canonical URL, then I could understand. But that is not the case. An example is http://www.terapvp.com/forums/news/ If you look at the source you will clearly see Where is the benefit to including the 10 other records in the Crawl report which show this same page in various sort orders? Am I missing anything? 3. My robots.txt appropriately blocks many pages that I do not wish to be crawled. What is the benefit to including these many pages in the crawl report? Perhaps I am over analyzing this report. I have read many articles on SEO, but now that I have found SEOmoz, I can see I will need to "unlearn what I have learned". Many things such as setting meta keyword tags are clearly not helpful. I wish to focus my energy and I was looking to the crawl report as my starting point. Either I am missing something, or the report design needs improvement.
Intermediate & Advanced SEO | | RyanKent0