How to fully index big ecommerce websites (that have deep catalog hierarchy)?
-
When building very large ecommerce sites, the catalog can hold millions of product SKUs, with many layers of hierarchical navigation (say 7-10) between the home page and those SKUs. Such sites can be difficult to get substantially indexed. The problem doesn't appear to be the content of the product pages. The concern is the 'intermediate' pages: the many navigation layers between the home page and the product pages that a user needs in order to funnel down to the desired product. There are a lot of these intermediate pages, and they commonly contain just a few menu links and thin or no content. (It's tough to put fresh, unique, quality content on every intermediate page whose only purpose is to help the user navigate a big catalog.) We've experimented with noindex, follow on these pages. But structurally, it seems that a site with many thin intermediate pages can end up with shallow indexing, weak PageRank, crawl budget problems, and so on. Any creative suggestions on how to tackle this?
-
Yes, the links should come from your own website.
If you have a powerful site, creating sitewide links to several logical category pages within your product pages can be adequate.
If your site is new or not very strong yet, then it may be best to grow the number of product pages in steps, as your site proves able to get them into the index and hold them there. A weak site will probably not be able to get 5,000,000 pages indexed. If your site is not powerful, attempting it usually results in a ranking decline on the original part of the site.
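The "grow in steps" idea can be sketched as a simple release gate. This is only an illustration: the function name, the batch size, and the 80% threshold are assumptions made up for the example, and the indexed count would have to come from whatever index reporting you trust.

```python
def next_batch_size(published, indexed, batch=50_000, min_indexed_ratio=0.8):
    """Return how many new product pages to publish in the next step.

    published: product pages released so far
    indexed:   how many of those pages are currently held in the index
    batch and min_indexed_ratio are hypothetical values; tune them for your site.
    """
    if published == 0:
        # Nothing released yet, so start with a first batch.
        return batch
    if indexed / published >= min_indexed_ratio:
        # The site is holding what it already has, so release another batch.
        return batch
    # The index is not keeping up: pause until more pages are indexed.
    return 0
```

For example, with 100,000 pages published and only 50,000 of them indexed, the gate returns 0 and no new pages go out until the existing ones are picked up.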
-
Thanks for the response. To clarify... you're suggesting we link internally from our highest PR pages to pages deep inside the catalog (i.e., product pages)?
-
From high PR pages, link deep into the site to many different internal hubs. That forces spiders into the depths of the site and forces them to chew their way out through unindexed pages. These links must remain in place permanently if you want the site to stay in the index, because if Google goes too long without spidering a page it will forget about it.
A mistake that people often make is to try to place five million pages on a PR3 website. That will not work, because not enough spiders are coming in. For a site like the one you are talking about, you might need many dozens of healthy PR6 links or hundreds of PR5 links, and quite a bit of prayer. For a site as deep as yours, you might need to link to hubs at multiple depths, because Google budgets the amount of crawling it will perform. The spiders will die down there.
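To see why hub links at multiple depths help, here is a rough sketch that measures click depth (fewest clicks from the home page) with a breadth-first search over an internal link graph. The page names and the toy 7-level catalog are invented for the example, and click depth is only a stand-in for how far a spider has to travel, not a model of how Google actually allocates crawl.

```python
from collections import deque

def click_depths(links, start):
    """Breadth-first search: fewest clicks from `start` to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical catalog: home -> c1 -> c2 -> ... -> c6 -> product
levels = ["home", "c1", "c2", "c3", "c4", "c5", "c6", "product"]
links = {a: [b] for a, b in zip(levels, levels[1:])}

# Same catalog, plus one deep link from the home page to a mid-level hub (c4).
with_hubs = dict(links)
with_hubs["home"] = ["c1", "c4"]

print(click_depths(links, "home")["product"])      # 7
print(click_depths(with_hubs, "home")["product"])  # 4
```

In this toy graph, a single link from the home page to the c4 hub cuts the product page's depth from 7 clicks to 4. Running the same function over a real crawl export would show which parts of the catalog are still buried.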