Testing How Crawl Priority Works

The author's views are entirely their own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

A SHORT INTRODUCTION...

It is well known that search engine robots visit popular pages more frequently, i.e. those with the largest number of incoming links, both internal and external. The architecture of a website usually correlates with the popularity of its pages, as measured by the number of backlinks:

  • The home page has the most backlinks,
  • 1st (e.g. product categories), 2nd and 3rd level pages obtain fewer links,
  • finally, the least important are deep pages (articles, classified ads, product pages, etc.).

The relationship between this “importance” of pages and the website architecture is illustrated in one of Rand's posts, titled "Diagrams for Solving Crawl Priority & Indexation Issues":

Typical Site's Link Earning Potential by Content Section

Important pages also tend to get a different indexation priority, which Rand presented very nicely as well:

Spider Crawl Priority Paths Graphic

Purple spots are the pages with the highest number of external links. As can be seen, the pages close to them take on some of that popularity and pass part of it further (pink spots). All the other spots stand for pages that are too far from the entry points of search engine robots, which means that their chance of being indexed is much smaller.

For classifieds websites, which contain a lot of content, the above diagram should also include the subsequent category listing or search results pages. They are obviously less important than the main category pages, but whether they are indexed also influences the indexation of the ad detail pages they link to. This is particularly important when the listing starts with so-called premium ads, which change less often than standard classifieds.

BEFORE THE TEST...

With this theory in hand, we decided to see how it works in practice. We analyzed http://www.morusek.pl (a Polish classifieds website for animals and pets), which has more than 100,000 indexed pages. Using a combination of "site" and "inurl" queries, we checked how many pages with a list of classifieds (in Polish “ogloszenia”) were indexed: http://www.google.pl/search?q=site%3Awww.morusek.pl+inurl%3A%22%2F0%2F%22+inurl%3Aogloszenia
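A query URL of this kind can be assembled programmatically. The snippet below is only a minimal sketch: the helper name `site_query_url` and its parameters are our own illustration, and the quoted fragment "/0/" is copied verbatim from the query above.

```python
from urllib.parse import quote_plus


def site_query_url(host, url_fragments, google="http://www.google.pl/search"):
    """Build a Google search URL combining one site: term with inurl: terms.

    Hypothetical helper for illustration; it only reproduces the URL format
    shown in the article.
    """
    terms = [f"site:{host}"] + [f"inurl:{fragment}" for fragment in url_fragments]
    # quote_plus percent-encodes ':' '"' and '/' and turns spaces into '+'.
    return f"{google}?q={quote_plus(' '.join(terms))}"


# The two inurl: fragments used in the article's query.
listing_url = site_query_url("www.morusek.pl", ['"/0/"', "ogloszenia"])
print(listing_url)
```

Changing the list of fragments gives the corresponding query for any other URL pattern on the site.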

The initial results were the following:

Indexation status in Google of ad listing pages of Morusek.pl

To continue the analysis, we excluded the first pages, because their numbers are inflated by category pages that have no classifieds at the moment but are still indexable (there are crawlable links to them in the menu). In addition, to verify the reliability of the "site" query, we took into account the number of pages reported by Google Webmaster Tools (GWT) under "Internal Links". The results were as follows:

Indexation of ad listing pages

WHAT'S IMPORTANT TO KNOW?

The first conclusion is obviously that the higher the page number, the lower the probability that the page will be indexed. Secondly, while the absolute numbers from GWT and the “site” queries differ a lot, the trends (slopes) are almost the same. On average, the chance that the robot will crawl to the next page of search results decreases by 1.2-1.3% per page.

It is also interesting that, according to Google Webmaster Tools, pages 2 to 4 have a good indexation ratio, which then drops sharply at page 5. For example, for pages number 4 the level of indexation is 60%, while for pages number 15 it falls below 30% (according to Google Webmaster Tools) or 40% (according to the “site” command in Google). This is because Googlebot has a much longer way to go to reach the relevant link in the latter case (a link to page 15 first appears on page 12), while direct links to pages 2, 3 and 4 appear on the first page of each listing (see below):
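One way such a per-page decline can be estimated is a least-squares slope of the indexation ratio against the page number. The sketch below assumes the decline is measured in percentage points per page. The input values are made-up placeholders chosen only to make the example run, not the measurements from this study.

```python
def slope_per_page(page_numbers, indexed_pct):
    """Ordinary least-squares slope of indexation (%) against page number."""
    n = len(page_numbers)
    mean_x = sum(page_numbers) / n
    mean_y = sum(indexed_pct) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(page_numbers, indexed_pct)
    )
    variance = sum((x - mean_x) ** 2 for x in page_numbers)
    return covariance / variance


# PLACEHOLDER data (hypothetical, not taken from the charts in this post).
pages = [5, 6, 7, 8, 9]
placeholder_pct = [50.0, 48.5, 47.5, 46.0, 45.0]

decline = slope_per_page(pages, placeholder_pct)
print(f"{decline:.2f} percentage points per page")
```

Running the same function separately on the GWT series and on the “site” series would show whether the two slopes agree even when the absolute levels differ.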

Pagination links of Morusek.pl before introducing the change

THE SUBJECT OF THE TEST: INTRODUCING MORE LINKS

We decided to test how the indexation ratios would change if we introduced more links to subsequent ad listing pages. On the first page of each category we added links to the 5th, 10th and 15th pages, as shown in the picture below:

Pagination links on Morusek.pl after the change

After a month we measured the effect of the change. Because the “site” command in Google returned inaccurate results (the number of indexed pages seemed to be greater than the actual number of pages), we present data from Google Webmaster Tools (internal links) only:

Comparison of before and after changes of indexation of ad listing pages

THE RESULTS

The graph clearly shows that indexation of the pages newly linked from the first page of the listing (the 5th, 10th and 15th) is much higher after the change, and actually equals the indexation of pages 2, 3 and 4.

However, the increase in indexation of the pages linked directly from the first page of the listing did not affect the indexation of the neighbouring pages. For example, we can see a huge increase for page 10, but there is no change for pages 9 and 11. The conclusion is that for Googlebot these pages are too far from the points of entry. Only the category pages for the main region have incoming links. To index page 9 of an intersection of category and region, the robots would have to follow this path:

  1. main category page (entry point),
  2. category page + region (first page of results),
  3. category page + region (tenth page of results),
  4. category page + region (page 9 of the results).

To make matters worse, not all the category pages have incoming links.
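The distance argument can be made concrete by counting how many clicks separate each results page from the first page of a listing. The sketch below uses a breadth-first search over an assumed pagination model: every page links to the three pages before and after it (consistent with page 1 linking to pages 2-4 and a link to page 15 first appearing on page 12), and after the change the first page also links to pages 5, 10 and 15. The window size in both directions and the function names are our assumptions, not a description of the actual Morusek.pl templates.

```python
from collections import deque


def pagination_links(page, last_page, window=3, shortcuts=()):
    """Pages linked from `page` under the assumed pagination model."""
    links = {
        p
        for p in range(page - window, page + window + 1)
        if 1 <= p <= last_page and p != page
    }
    if page == 1:
        # Extra links added only on the first page of the listing.
        links.update(p for p in shortcuts if 1 < p <= last_page)
    return links


def click_depth(last_page, window=3, shortcuts=()):
    """Minimum number of clicks from page 1 to every results page (BFS)."""
    depth = {1: 0}
    queue = deque([1])
    while queue:
        page = queue.popleft()
        for target in pagination_links(page, last_page, window, shortcuts):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


before = click_depth(last_page=15)
after = click_depth(last_page=15, shortcuts=(5, 10, 15))
for page in (4, 9, 10, 15):
    print(page, before[page], after[page])
```

In this model the depth is counted from the first page of the listing, so the main category page (the entry point) adds one more step to every path, as in the list above.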

THE CONCLUSIONS

For classifieds or e-commerce websites, the conclusion is that the more pages are linked from the listing, the greater the chance that they will be indexed. In general, the farther a page is from the point of entry (external link), the lower the chance that it will be indexed. Therefore, it is advisable not to create sites with a very deep structure and to remember that pages far from the points of entry should be additionally linked to (for example as "similar products", "see also", "related categories", etc.).

Looking at the chart we can see yet another change: a slight decrease in indexation of pages 2, 3 and 4. This may be either because new pages were added recently and have not been indexed yet (when the number of ads in a certain category started to exceed the space on the first page), or because of the increase in the number of outgoing links on the first page. I would bet on the first explanation, because the new links were in fact added to only a small percentage of pages. There are only 400 fifth pages (so the links to fifth pages were placed on 0.5% of all the first pages). Pages 10 and 15 are even less numerous.

The introduction of additional links has not increased the level of indexation of the classifieds themselves; however, I suppose that the rate of change was simply too small to affect their indexation. Moreover, the indexation of ads on Morusek.pl already exceeded 80% when the experiment started. Such changes can produce a visible increase in the number of indexed pages on sites where the rate of change is much higher and the level of indexation of classifieds or products is lower.
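The two figures quoted above also imply a rough total number of first pages. That total is not stated in this post, so the value below is only a derived estimate:

```python
# Figures quoted in the paragraph above.
fifth_pages = 400        # listings that have a fifth page
share_of_first = 0.005   # 0.5% of all first pages received a link to page 5

# Derived estimate (an inference from the two figures, not a reported number).
implied_first_pages = round(fifth_pages / share_of_first)
unchanged_share_pct = (1 - share_of_first) * 100

print(implied_first_pages, unchanged_share_pct)
```

In other words, roughly 99.5% of the first pages received no new outgoing links, which is why the extra links are the less likely explanation for the decrease.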

Maciej Galecki is CEO of Bluerank, the leading Polish SEM agency, which delivers SEO, PPC and Web analytics services to businesses all over the world with a special focus on the classifieds industry. Bluerank is a certified Google Qualified Company and Google Analytics Authorized Consultant, while Maciej holds individual GAP and GAIQ certificates.
