Robots.txt question
-
I noticed something weird in Google's robots.txt Tester. I have this line in my robots.txt:
Disallow: display=
Whatever URL I give the tester, it reports the URL as blocked and highlights that line. The rule is meant to block pages like
http://www.abc.com/lamps/floorlamps?display=table
but if I test
http://www.abc.com/lamps/floorlamps
or any other page, it shows as blocked due to Disallow: display=. Am I doing something wrong, or is Google just acting strangely? I don't think pages without display= are actually being blocked.
-
Yes, there is a bug in your robots.txt. A Disallow path should begin with /, and the tester appears to treat the malformed rule as matching every URL. To block only URLs that contain the display= parameter, use something like:
Disallow: /*?display=
Note that a rule such as Disallow: /?display=table only matches that query string on the root URL itself; it would not block deeper paths like /lamps/floorlamps?display=table.
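Google matches Disallow patterns with two wildcards: * matches any run of characters, and a trailing $ anchors the end of the URL. As a quick way to sanity-check which paths a given rule blocks, here is a minimal Python sketch of that matching (an illustration only, not Google's actual parser, and it checks a single rule rather than a full robots.txt):

```python
import re

def is_blocked(path, rule):
    """Check a URL path against one Disallow pattern, Google-style:
    '*' matches any run of characters, a trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    core = rule[:-1] if anchored else rule
    # Escape everything except '*', which becomes the regex '.*'
    body = "".join(".*" if ch == "*" else re.escape(ch) for ch in core)
    pattern = "^" + body + ("$" if anchored else "")
    return re.match(pattern, path) is not None

# A rule that targets only URLs carrying the display= parameter:
print(is_blocked("/lamps/floorlamps?display=table", "/*?display="))  # True
print(is_blocked("/lamps/floorlamps", "/*?display="))                # False
# A root-anchored rule does not reach deeper paths:
print(is_blocked("/lamps/floorlamps?display=table", "/?display=table"))  # False
```

Keep in mind that in a real robots.txt, when several Allow/Disallow rules match, the most specific (longest) rule wins; this sketch deliberately ignores that.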
Related Questions
-
Duplicate H1 Question & Landing Page help
Hi, we have two H1s on this page: http://www.key.co.uk/en/key/heavy-duty-shelving. Our webmaster has set one to display:none, but isn't this just going to look like we're keyword spamming and trying to hide it? OK, now that I'm looking, I'm seeing more wrong with this page... The width buttons at the top are H2s, and they link to facet pages? Won't this just waste crawl budget? And every product title, user guide title, etc. is an H2... I just need to put a plan together to give to our dev team on what should be updated. Any tips would be great. Becky
Intermediate & Advanced SEO | BeckyKey
-
Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
My site is set up at http://www.site.com, and I redirect the non-www version to the www version in my .htaccess file. My question is: what should my robots.txt file look like for the non-www site? Do you block robots from crawling it like this, or do you leave it blank? User-agent: * Disallow: / Sitemap: http://www.morganlindsayphotography.com/sitemap.xml Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
Intermediate & Advanced SEO | morg45454
-
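For reference, the non-www to www redirect the question describes is typically done like this in .htaccess (a sketch assuming Apache with mod_rewrite and the hypothetical host site.com). Because the redirect also fires for /robots.txt, the non-www host never serves a robots.txt of its own, so there is nothing separate to block:

```apache
# 301-redirect every non-www request (including /robots.txt) to the www host
RewriteEngine On
RewriteCond %{HTTP_HOST} ^site\.com$ [NC]
RewriteRule ^(.*)$ http://www.site.com/$1 [R=301,L]
```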
Http to https question (SSL)
Hi, I recently made two big changes to a site - www.aerlawgroup.com (not smart, I know). First, I changed from Weebly to Wordpress (WP Engine hosting with CDN + Cloudflare - is that overkill?) and I added SSL (http to https). From a technical perspective, I think I made a better site: (1) blazing fast, (2) mobile responsive, (3) more secure. I'm seeing the rankings fluctuate quite a bit, especially on the important keywords. I added SSL to my other sites, and saw no rankings change (they actually all went up slightly). I'm wondering if anyone has had experience going to SSL and can give me feedback on something I might have overlooked. Again, it's strange that all the other sites responded positively, but the one listed above is going in the opposite direction. Maybe there are other problems, and the SSL is just a coincidence. Any feedback would be appreciated. I followed this guide: http://moz.com/blog/seo-tips-https-ssl - which helped tremendously (FYI).
Intermediate & Advanced SEO | mrodriguez1440
-
SEO Question re: Keyword Cannibalization
I know about Keyword Cannibalization, so I understand why it's generally a problem. If you have multiple versions of the same page, Google has to "guess" which one to display (as I understand it, unless you have a SUPER influential page you won't get both pages showing up on the SERP). To explain why I'm not sure if this applies to our page: we have a blog that we write about employment law issues on. So we might have 20 blog posts over the past year that all talk about recent pregnancy discrimination lawsuits employers might be interested in. Now, searching the Google Keyword tools, there aren't even close to 20 different focus keywords that would make any sense. "Pregnancy Discrimination lawsuit" is niche enough for us to be competitive, but anything more specific than that simply has very little search activity. My suggestion is to just optimize all of them for "pregnancy discrimination lawsuit". My understanding of how Panda works is that if the content is different on each page (and it is!), then it will only display what it guesses is the most relevant "NLRB" post, but any link juice sent to the other 19 "NLRB" posts would still boost the relevancy of whichever post Google chooses. And it wouldn't get dinged as keyword stuffing because it's clearly not just the same page repeated over and over. I've found quite a few articles on Keyword Cannibalization, but many are pre-Panda. I was CERTAIN I'd seen a post that explained my idea is a totally viable and good one, but of course now I can't find it. So before I go full steam ahead with this strategy I just want to make sure there's nothing I'm missing. Thanks!
Intermediate & Advanced SEO | CEDRSolutions
-
Question spam malware causing many indexed pages
Hey Mozzers, I was speaking with a friend today about a site that he has been working on, which was already infected when he began working on it. Here (https://www.google.ca/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=site:themeliorist.ca) you can see that the site has 4400 indexed pages, but if you scroll down you will see some pages such as /pfizer-viagra-samples/ or /dapoxetine-kentucky/. All of these pages now return 404 errors, and I ran the site through SEO Spider just to see if any of these pages would show up, and they don't. This is not an issue for a client, but I am just curious why these pages are still hanging around in the index. Maybe others have experienced this issue too. Cheers,
Intermediate & Advanced SEO | evan89
-
Meta NoIndex tag and Robots Disallow
Hi all, I hope you can spend some time answering the first of a few questions 🙂 We are running a Magento site, and the layered/faceted navigation nightmare has created thousands of duplicate URLs! During my process of tackling the issue, I disallowed in robots.txt anything in the query string that was not a p (I allowed this for pagination). After checking some pages in Google, I did a site:www.mydomain.com/specificpage.html and a few duplicates came up along with the original, with "There is no information about this page because it is blocked by robots.txt". So I had also added Meta Noindex, Follow on all these duplicates, but I guess it wasn't being read because of robots.txt. So, coming to my question: did robots.txt block access to these pages? If so, were these pages already in the index, and after I disallowed them with robots.txt, Googlebot could not read the Meta Noindex? Does Meta Noindex, Follow on pages actually help Googlebot decide to remove those pages from the index? I thought robots.txt would stop and prevent indexation, but I've read this: "Noindex is a funny thing; it actually doesn't mean 'You can't index this', it means 'You can't show this in search results'. Robots.txt disallow means 'You can't index this', but it doesn't mean 'You can't show it in the search results'." I'm a bit confused about how to use these, both to prevent duplicate content in the first place and to address dupe content once it's already in the index. Thanks! B
Intermediate & Advanced SEO | bjs2010
-
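On the interplay the question describes: a robots.txt disallow stops Googlebot from fetching the page at all, so a meta noindex on a disallowed URL is never seen. For the noindex to take effect, the URL must stay crawlable and carry the tag (a minimal sketch):

```html
<!-- The URL must NOT be disallowed in robots.txt, or Googlebot never reads this -->
<meta name="robots" content="noindex, follow">
```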
Spammy Link Profile Questions. What do you think?
I'm trying to dilute the link profile for a website, but I have a couple of questions on the best way to achieve this. Current link profile: www.mysitename.com Keyword 1 Keyword 2 Keyword 3 Keyword 4 Keyword 5 Keyword 6 Keyword 7 Keyword 8 Keyword 9 Keyword 10 Keyword 12 Keyword 13 Keyword 14 Keyword 14 Keyword 15 mysitename.com Desired link profile: www.mysitename.com mysitename.com www.mysitename.com http://www.mysitename.com/ My Site Name http://mysitename.com Click Here my site name More Info mysitename.com/ www.mysitename.com/ Keyword 1 Keyword 2 Keyword 3 Keyword 4 Keyword 4 Keyword 5 Questions: 1. Do you think Google looks at this on a domain level, or do you think this needs to be done with every page on the site? 2. What would be a good way to build links to these pages quickly? I need to build lots of links to be able to dilute the profile. I was considering Dripable, or a similar service, but decided I really don't want to create more spam. What would you do? 3. What would you say the % threshold for anchor text is? I have read from different sources that at least 40%-60% of links should be branded, URL, or generic anchor links. Do you think this is accurate?
Intermediate & Advanced SEO | 858-SEO
-
Archive or no archive?... That is the question!
When running a classified site, what is best practice for what to do with expired ads? Should they stay on the site with a sold stamp perhaps? Or should they be moved to an archive subdomain, with the original URL 301 redirecting to the new archive ad? I'm kinda thinking the second option but I suppose the only issue with this is you would have to have a consistent flow of new ads on the site to prevent categories from getting too thin. Thoughts on this and any other/better solutions would be much appreciated. Thanks.
Intermediate & Advanced SEO | Sayers