Disallow: /jobs/? is this stopping the SERPs from indexing job posts
-
Hi,
I was wondering what this would be used for as it's in the Robots.exe of a recruitment agency website that posts jobs. Should it be removed?Disallow: /jobs/?
Disallow: /jobs/page/*/Thanks in advance.
James -
Hi James,
So far as I can see you have the following architecture:
- job posting: https://www.pkeducation.co.uk/job/post-name/
- jobs listing page: https://www.pkeducation.co.uk/jobs/
Since from the robots.txt the listing page pagination is blocked, the crawler can access only the first 15 job postings are available to crawl via a normal crawl.
I would say, you should remove the blocking from the robots.txt and focus on implementing a correct pagination. *which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change is to make the job post title an anchor text for the job posting. (every single job is linked with "Find out more").
Also if possible, create a separate sitemap.xml for your job posts and submit it in Search Console, this way you can keep track of any anomaly with indexation.
Last, and not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
-
Hi Istvan,
Sorry I've been away for a while. Thanks for all of your advice guys.
Here is the url if that helps?
https://www.pkeducation.co.uk/jobs/
Cheers,
James
-
The idea is (which we both highlighted), that blocking your listing page from robots.txt is wrong, for pagination you have several methods to deal with (how you deal with it, it really depends on the technical possibilities that you have on the project).
Regarding James' original question, my feeling is, that he is somehow blocking their posting pages. Cutting the access to these pages makes it really hard for Google, or any other search engine to index it. But without a URL in front of us, we cannot really answer his question, we can only create theories that he can test
-
Ah yes when it's pointed out like that, it's a conflicting signal isn't It. Makes sense in theory, but if you're setting it to noindex and then passing that on via a canonical it's probably not the best is it.
They're was link out in that thread to a discussion of people who still do that with success, but after reading that I would just use noindex only as you said. (Still prefer the no index on the robots block though)
-
Sorry Richard, but using noindex with canonical link is not quite a good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
-
I don't think it should be blocked by robots.txt at all. It's stopping Google from crawling the site fully. And they may even treat it negatively as they've been really clamping down on blocking folders with robots.txt lately. I've seen sites with warning in search console for: Disallow: /wp-admin
You may want to consider just using a noindex tag on those pages instead. And then also use a canonical tag that points back to the main job category page. That way Google can crawl the pages and perhaps pass all the juice back to the main job category page via the canonical. Then just make sure those junk job pages aren't in the sitemap either.
-
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? which basically blocks every single URL that contains /jobs/**? **
For example: domain.com**/jobs/?**sort-by=... will be blocked
If you want to disallow query parameters from URL, the correct implementation would be Disallow: /jobs/*? or even specify which query parameter you want to block. For example Disallow: /jobs/*?page=
My question to you, if these jobs are linked from any other page and/or sitemap? Or only from the listing page, which has it's pagination, sorting, etc. is blocked by robots.txt? If they are not linked, it could be a simple case of orphan pages, where basically the crawler cannot access the job posting pages, because there is no actual link to it. I know it is an old rule, but it is still true: Crawl > Index > Rank.
BTW. I don't know why you would block your pagination. There are other optimal implementations.
And there is always the scenario, that was already described by Matt. But I believe in that case you would have at least some of the pages indexed even if they are not going to get ranked well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
-
I'd guess that the jobs get pulled from a job board. If this is the case, then the content ( job description, title etc.) will just be a duplication of the content that can be found in many other locations. If a plugin is used, they sometimes automatically add a disallow into the robots.txt file as to not hurt the parent version of the job page by creating thousands of duplicate content issues.
I'd recommend creating some really high-quality hub pages based on job type, or location and pulling the relevant jobs into that page, instead of trying to index and rank the actual job pages.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
WordPress posts Title field inserts title into blog posts like a headline but doesn't ad H1 tag how to change?
I have a Wordpress website which is just using the Default theme, when I post in the blog, whatever I put in the "Title" field at the top of the editor is automatically is placed within the body of the blog post, like a headline, but it doesn't include any H1 tags that I can see. If I add my own headline within in the blog editor, it still inserts the Title like a headline. I am using the Yoast SEO Plugin and also write the meta title there, should I just leave the Wordpress title field blank so it doesn't insert into the blog post? Or is that inserted Title being recognized as an H1 even though I don't see h1 tags anywhere? Hope this isn't too confusing.
Intermediate & Advanced SEO | | SEO4leagalPA1 -
How do the Quoras of this world index their content?
I am helping a client index lots and lots of pages, more than one million pages. They can be seen as questions on Quora. In the Quora case, users are often looking for the answer on a specific question, nothing else. On Quora there is a structure setup on the homepage to let the spiders in. But I think mostly it is done with a lot of sitemaps and internal linking in relevancy terms and nothing else... Correct? Or am I missing something? I am going to index about a million question and answers, just like Quora. Now I have a hard time dealing with structuring these questions without just doing it for the search engines. Because nobody cares about structuring these questions. The user is interested in related questions and/or popular questions, so I want to structure them in that way too. This way every question page will be in the sitemap, but not all questions will have links from other question pages linking to them. These questions are super longtail and the idea is that when somebody searches this exact question we can supply them with the answer (onpage will be perfectly optimised for people searching this question). Competition is super low because it is all unique user generated content. I think best is just to put them in sitemaps and use an internal linking algorithm to make the popular and related questions rank better. I could even make sure every question has at least one other page linking to it, thoughts? Moz, do you think when publishing one million pages with quality Q/A pages, this strategy is enough to index them and to rank for the question searches? Or do I need to design a structure around it so it will all be crawled and each question will also receive at least one link from a "category" page.
Intermediate & Advanced SEO | | freek270 -
Drop in Indexed pages
Hope everyone is having an Awesome December! I first noticed a drop in my index in the beginnings of November. My site drop in indexed pages from 1400 to 600 in the past 3-4 weeks. I don't know the cause of it, and would like the community to help me figure out why my indexing has dropped. Thank you for taking time out of your schedule to read this.
Intermediate & Advanced SEO | | BSC0 -
Page not appearing in SERPs
I have a regional site that does fairly well for most towns in the area (top 10-20). However, one place that has always done OK and has great content is not anywhere within the first 200. Everything looks OK, canonical link is correct, I can find the page if I search for exact text, there aren't any higher ranking duplicate pages. Any ideas what may have happened and how I can confirm a penalty for example. TIA,
Intermediate & Advanced SEO | | Cornwall
Chris0 -
Best to Post Dynamic Content (Listings) under "Posts" in Wordpress?
My commercial real estate web site is being migrated to Wordpress from Drupal. Is it advisable to place dynamic content that will use taxonomy under "Posts" ? Listings will be changed every few months and there could be anywhere from several hundred to several thousand of them on the site. Developers have given me different advice. One has been adamant that listings and neighborhood pages (there will be about 25 neighborhood pages) should not be in the post section which is to be strictly reserved for blog entries. The last thing I want is to create a site structure which is unfriendly to SEO!!!! I would very much appreciate the perspective of anyone proficient with Wordpress and SEO. Thanks!!!
Intermediate & Advanced SEO | | Kingalan1
Alan Rosinsky0 -
Does Google Index an Alert Div w/Delayed Hide
We have a div at the top of a client's the page that displays an alert to the user. After 30 seconds it is rendered hidden. Does Google index this? Does Google take this into account when it ranks the page?
Intermediate & Advanced SEO | | WEOMedia0 -
/%category%/%postname%/ Permalink structure
Mostly everyone seems to agree that /%category%/%postname%/ is the best blog structure. I'm thinking of changing my structure to that because now it's structured by date which is bad. But almost all of my posts are assigned to more than one category. Won't this create duplicate pages?
Intermediate & Advanced SEO | | UnderRugSwept0 -
Is there a FastTrack to re-index? a site?
Hello... i just started with a new client this week, before working with us his last domain-hosting-webdev provider cancel their account and took off the entire site and left them with a nice "under construction page" (NOT) and added the noindex, nofollow tags. 4 weeks after that, we come into the scene and of course our client it's expecting us to reinsert at least for branded terms the site, and he wants it done on a matter of hours... I tried my best to explain that it's not possible and we are doing everything we can't.... now i ask you guys.. I already created de GWT account, Created a well structured Sitemap and submitted it to google and bing, did the onpage optimizitation at least the basics... there is a way to speed up the process? kind of like "hey you! google bot, forget about the noindex nonsense a come crawl again?" Any help would be great Daniel
Intermediate & Advanced SEO | | daniel.alvarez0