How to allow bots to crawl all but WP-content
-
Hello,
I would like my website to remain crawlable to bots, but to block my wp content and media. Does the following robots.txt work? I worry that the * user agent may conflict with the others.
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/User-agent: GoogleBot
Allow: /User-agent: GoogleBot-Mobile
Allow: /User-agent: GoogleBot-Image
Allow: /User-agent: Bingbot
Allow: /User-agent: Slurp
Allow: / -
Thank you for the help, Gaston!
-
Yeap, with that you are allowing every file ending with that extension
-
Can I do so with:
Allow: *.jpg
Allow: *.png
-
Thanks, Gaston. I should have been more clear about what I am looking to do. I currently am having an indexation issue. Somehow, pages are being automatically generated by WordPress.
These pages are often .txt files of information or code from plugins, all beginning with /wp-content/uploads/ in their URL. I have been manually removing them from the index and would like to now have them be uncrawlable.
Best
-
Oh god, my mistake!
Im deeply sorry, yes, this configuration will block images! that follow that folder structure!I'll correct myself.
Thanks for pointing it out! -
Gaston,
Thanks for the fast reply! My images folder does follow that format, which is what makes me worrisome as we are blocking the wp-conent folder.
Thanks!
-
Hi Tom,
Yes, this config will allow images to be crawled,
No, this config will block images to be crawled,as long as your wordpress has the defalt folder for images: /wp-content/uploads/year/month/image-name.png
How to know, super easy, where your images are stored? Go to the web where you can find an image... Then right clic and then copy link address. With that link you will find that folder structure.
Hope it helps.
Best luck.
GR -
Hi Gaston,
I just wanted to follow up with you with one last question if possible. Would this allow my images and PDF's to be crawled & indexed still?
Thanks!
-
Awesome. Thanks, Gaston!
-
Yes it does.
As I said earlier. Copy and paste that code into the robot.txt tester in any of your search console and try with some name.css or testing.js just for testing.
Check the image i've attached.Hope it helps.
Best luck
GR -
Thank you for the response. I'm still a little uncertain, does the version you wrote allow the bots to crawl the css and js as well?
Best
-
Hi Tom!
That Robots.txt config is pretty redundant.
To acheive what you what, thy this:User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Allow: *.js
Allow: *.cssJust 3 things to note here:
1- That User-agent:* and those disallows blocks for every bot to crawl whats in those folders.
2- When blocking /wp-content/ you are also blocking the /themes/ folder and inside are the .js and .css files. Blocking those files cause to googlebot not being able to render correctly that page and see it different from what a normal user would see.
3- Those Allow:/ dont prevent the disallow.To try that configuration, you can use the robots.txt tester in search console, just inder the Crawl menu.
Remember that by default google considers that you are not blocking nothing.
More info here: The web robots.tat pageHope it helps.
Best luck.
GR
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Hide mobile content element
Hello, We are optimizing our mobile (responsive) website at this moment and we want to change some elements on the productpage of our webshop in order of rank. This is the outcome of a data analysis which proves a change in rank of order will result in a higher conversion rate. No doubt. By technical limitations there is no other option but duplicating the element 'product description' in the source of the page to be able to show in on the spot on the product page we like to. Because our webshop is responsive we can not just move the element to the right spot for mobile because the tablet and desktop version then will change as well. My question is: will it be a problem for Google if we hide the original element of the product description on mobile pages by using the bootstrap class "hidden-xs" and duplicate it on another spot in the page to show it with the "visible-xs" class? My concern is that this will create duplicate content and hiding content for Google is not particularly good. On the other hand, I think Google is smart enough to understand that this is not to manipulate visitors or rankings, but this is only for a different look of the mobile website. I hope you guys can give me some good advice.
Technical SEO | | MarcelMoz
Thanks in advance.
Marcel0 -
Duplicate content or an update ???
Buying Guide and Product Category page competing for the same keyword ? Got a “nuts and bold website” selling basic stuff. Imagine selling simple nuts, bolts and washers (the little ring that goes in between) in different metals. Imagine a website with a very wide and deep line of these simple products. For long tail keywords we rank well (Example: 0.25 inch bolts). For the keyword: “Nuts bolts” our main category page use to rank well low 1<sup>st</sup> page to second page up against the big guys (Amazon, Walmart, Target, Costco, some drug store who may have a mix pack of nuts and bolts, but still Google don’t see the difference and list 2 pages each for these guys). But then in mid-February there were an update and suddenly our “Buying guide for nuts and bolts” rank higher and started to compete with our own product category page. That was never our intention. These two pages now compete for the ranking on page 4<sup>th</sup>. Clearly there were more words on the buying guide page but no changes had been made to it for well months or years. To make up for it some more words were added to the category page, but of cause there is only so many way you can fraise words about “nuts and bolts” without sounding a bit duplicate/re-writing. So what do I do now ?? Clearly the product category page is the one we like to rank highest with the guide a close 2nd. Most customer don’t need the buying guide but it is good to have and great support as we got lot of good comments from customer who read it. Made a link to the buying guide from the category page and wise verses. The category page got an embedded video. Moz list the page authority for the category page to 16 and 1 for the buying guide but clearly G see it differently. Already tried to change the Meta Tag Title and Description a little but it is hard to do if the word “Nuts Bolts” is to appear in the description or people don’t know what to expect. Could just insert a “do not index” for the buying guide but not a good long term solution. Unfortunately I am out of imagination at this point. Any good suggestions ?? Thanks, Kim Any good suggestions ???
Technical SEO | | KimX0 -
What is the process for allowing someone to publish a blog post on another site? (duplicate content issue?)
I have a client who allowed a related business to use a blog post from my clients site and reposted to the related businesses site. The problem is the post was copied word for word. There is an introduction and a link back to the website but not to the post itself. I now manage the related business as well. So I have creative control over both websites as well as SEO duties. What is the best practice for this type of blog post syndication? Can the content appear on both sites?
Technical SEO | | donsilvernail0 -
Duplicate content problem
Hi, i work in joomla and my site is www.in2town.co.uk I have been looking at moz tools and it is showing i have over 600 pages of duplicate content. The problem is shown below and i am not sure how to solve this, any help would be great, | Benidorm News http://www.in2town.co.uk/benidorm-news/Page-2 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-102 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-103 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-104 9 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-106 28 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-11 50 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-112 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-114 45 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-115 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-116 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-12 50 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-120 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-123 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-13 50 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-130 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-131 50 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-132 31 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-140 4 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-141 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-21 10 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-22 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-23 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-26 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-271 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-274 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-277 50 21 2 In2town http://www.in2town.co.uk/blog/In2town/Page-28 50 21 2 In2town http://www.in2town.co.uk/blog/In2town/Page-29 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-310 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-341 21 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-342 4 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-343 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-345 1 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-346 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-348 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-349 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-350 50 16 0 In2town http://www.in2town.co.uk/blog/In2town/Page-351 50 19 1 In2town http://www.in2town.co.uk/blog/In2town/Page-82 24 1 0 In2town http://www.in2town.co.uk/blog/in2town 50 20 1 In2town http://www.in2town.co.uk/blog/in2town/Page-10 50 23 3 In2town http://www.in2town.co.uk/blog/in2town/Page-100 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-101 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-105 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-107 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-108 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-109 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-110 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-111 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-113 |
Technical SEO | | ClaireH-1848860 -
Duplicate Content
Hi, I'm working on a site and I'm having some issues with its structure causing duplicate content. The first issue is that the search pages will show up as duplicates.
Technical SEO | | OOMDODigital
A search for new inventory may be new.aspx
The duplicate may be something like new.aspx=page1, or something like that and so on. The second issue is with inventory. When new inventory gets put into the stock of the store, a new page for that item will be populated with duplicate content. There appears to be no canonical source for that page. How can I fix both of these? Thanks!0 -
Fixing Crawl Errors
Hi! I moved my Wordpress blog back in August, and lost much of my site traffic. I recently found over 1000 crawl errors in Webmaster Tools because some of my redirects weren't transferred, so we are working on fixing the errors and letting Google know. I'm wondering how long I should expect for Google to recognize that the errors have been fixed and for the traffic to start returning? Thanks! Jodi - momsfavoritestuff.com
Technical SEO | | JodiFTM0 -
Canonical usage and duplicate content
Hi We have a lot of pages about areas like ie. "Mallorca" (domain.com/Spain/Mallorca), with tabbed pages like "excursion" (domain.com/spain/Mallorca/excursions) and "car rental" (domain.com/Spain/Mallorca/car-rental) etc. The text on ie the "car rental"-page is very similar on Mallorca and Rhodos, and seomoz marks these as duplicate content. This happens on "car rental", "map", "weather" etc. which not have a lot of text but images and google maps inserted. Could i use rel=nex/prev/canonical to gather the information from the tabbed pages? That could show google that the Rhodos-map page is related to Rhodos and not Mallorca. Is that all wrong or/and is there a better way to do this? Thanks, Alsvik
Technical SEO | | alsvik0 -
Magento and Duplicate content
I have been working with Magento over the last few weeks and I am becoming increasingly frustrated with the way it is setup. If you go to a product page and remove the sub folders one by one you can reach the same product pages causing duplicate content. All magento sites seem to have this weakness. So use this site as an example because I know it is built on magento, http://www.gio-goi.com/men/clothing/tees/throve-t-short.html?cid=756 As you remove the tees then the clothing and men sub folders you can still reach the product page. My first querstion is how big an issue is this and two does anyone have any ideas of how to solve it? Also I was wondering how does google treat question marks in urls? Should you try and avoid them unless you are filtering? Thanks
Technical SEO | | gregster10001