Satan, SEO and Subdomains – VOL V. – SEOfirot of the Tree of Life
The category tree forms the core of the site and flows through it with life-giving sap.
SEOfirot of the Tree of Life is the fifth part of the series of texts "Satan, SEO and Subdomains". In the previous installment, One To Rule MFI, we discussed moving from a mobile subdomain to a fully responsive site. With that change complete, we were just a little short of the finale.
Let's quickly review the five main types of Heureka subdomains:
- Basic and System (info) ~20 subdomains.
- Mobile subdomain (m) 1 subdomain.
- Category subdomains (electronics) ~2500 subdomains.
- Brand subdomains (apple) ~60,000 subdomains.
- Parametric subdomains (xbox-360) ~1000 subdomains.
This text focuses on category subdomains, of which approximately 3500–4000 still exist as of the date of this text. See below to find out why the change has not yet taken place and when it will happen.
Trees of Life
The Tree of Life is one of the foundational archetypes of many mythologies and religions around the world.1 We can think, for example, of the Tree of Knowledge in the biblical Garden of Eden2, the Nordic Yggdrasil3, or the Kabbalistic Tree of Life depicting the ten sefirot.4 These sacred trees represent the concepts of knowledge, fertility, immortality, and the path to aspects of the divine.
(Kabbalistic Tree of Life. Image source: https://commons.wikimedia.org/wiki/File:Sefiroticky_strom.jpg).
Websites also have their tree of life. We call it the category tree. It manifests in different forms and taxonomic5 information structures, most often as navigation menus, breadcrumb navigation, or XML and JSON feeds.
The branches of this tree have a clearly defined form (URI6) or location (URL7), which should ideally be immutable. Branches can be moved within the hierarchy in a limited way, but it is very important that they retain their location identifiers. In other words, it is better not to change the URLs of important pages, so that the site stays stable.
Myths and Legends of the Ancient Websites
Heureka was founded in 2007. From the beginning, it used various subdomains. Many still exist at the same URLs after more than fifteen years of operation, for example mobilni-telefony.heureka.cz, notebooks.heureka.cz, or digitalni-fotoaparaty.heureka.cz. In 2022, the number of category subdomains ranged between 3500 and 4000. The vast majority of them have a really long history.
As in previous cases, we struggled with a number of category-related problems. First and foremost were the unexpected subdomains. In particular, they complicated our testing, evaluation, and development of the site. Covering new areas and creating more subcategories meant constantly creating new subdomains and exacerbating the associated problems. For example, we were forced to create duplicate products.
This is because product details on Heureka are directly linked to category URLs. Thus, a single product can exist on several URLs. For example, a classic paper book (https://knihy.heureka.cz/konec-prokrastinace-jak-prestat-odkladat-a-zacit-zit-naplno-petr-ludwig/), an e‑book (https://e-book-elektronicke-knihy.heureka.cz/konec-prokrastinace/), and an audiobook (https://audioknihy.heureka.cz/konec-prokrastinace-petr-ludwig/) are three URLs for virtually the same product.8
The most problematic technical issues that plagued our categories were spider traps. Even in the categories, there were errors with duplicated slashes (https://nike.heureka.cz/batohy/?o=4////), and filters could create further variations.9
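The duplicated-slash trap mentioned above can be illustrated with a small normalization routine. This is only a sketch, not a description of how Heureka handles it: the function names are hypothetical, and treating trailing slashes in the query string as noise is an assumption that may not hold on every site.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Two or more consecutive slashes anywhere in the path.
_REPEATED_SLASHES = re.compile(r"/{2,}")


def normalize_slashes(url: str) -> str:
    """Collapse repeated slashes in the path and drop trailing slashes from the query.

    Assumption: a query string never legitimately ends with a slash.
    """
    parts = urlsplit(url)
    path = _REPEATED_SLASHES.sub("/", parts.path) or "/"
    query = parts.query.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


def is_slash_trap(url: str) -> bool:
    """Return True when the URL differs from its normalized form."""
    return normalize_slashes(url) != url
```

A crawler log or a redirect rule could use `is_slash_trap` to flag such URLs and `normalize_slashes` to compute the address they should resolve to.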
One of the last problems was the deployment of the mega menu10, which exposed the fragility of the subdomain system. Google views subdomains as separate sites.11 Each subdomain thus has its own ranking and link profile. Put simply, after the mega menu was deployed, weaker subdomains siphoned ranking off the stronger ones but were logically unable to give the same back. So the strong categories weakened, and the weak ones basically gained nothing.
A similar effect would probably occur even if all categories were on the same (sub)domain, but it would not be as pronounced. Sharing links within a single (sub)domain would still dilute the rankings, but it would also distribute them more appropriately. This is largely an assumption, though, because we don't really know how exactly Google counts links between subdomains or what weight it gives to different types of links.
After some minor adjustments, the effect of the mega menu deployment has been relatively stable. But the signal that we can't continue with subdomains was strong enough.
The Tree of Death
Not gonna lie: we didn't want to go through the trouble of redirecting the categories, even though it was necessary. All site changes need to be looked at from a business perspective. The parametric sections and brand corners were pretty fun: relatively low traffic, few conversions, and almost no backlinks. Mistakes there can hurt a bit, but those sections can be touched with relative ease.
Changing the URLs of all the categories, filters, and products built up over the years, where the vast majority of traffic flows, is certainly no fun anymore. Few people would willingly expose the site they are in charge of to such a risk, even with the problems and unsustainability of the existing system in mind. The idea that search engines might not react positively to this change was somewhat frightening.
Despite our fears, we planned and prepared for all the work. In the end, circumstances decided for us. As part of the transition to One Platform12, we also started preparing a single category tree for sites in all Heureka Group countries, as well as a new unified URL structure.
This would mean redirecting the categories twice in a short time: first on the current platform, and then again in 2–3 years on new URLs to match the needs of the new platform. Taking the same risk several times unnecessarily didn't make sense, so we scrapped the original plans.
Subdomains are not acceptable for One Platform and are no longer considered. Sites will be unified on clearly defined URLs. For example, products will live at URLs like https://www.heureka.cz/p/apple-iphone-13-128gb/ and categories at URLs like https://www.heureka.cz/c/mobilni-telefony-c1234/. The URLs will now have slugs and identifiers to allow easier processing and evaluation of data across all countries.13 So Heureka will eventually get rid of almost all subdomains.
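To show why a slug plus an identifier makes data easier to process, here is a minimal sketch that splits a category URL of the sample shape above into its two parts. The pattern `/c/<slug>-c<id>/` is inferred only from the sample URL, which may not be the final form, and the names `CategoryRef` and `parse_category_url` are hypothetical.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class CategoryRef(NamedTuple):
    slug: str
    category_id: int


# Assumed shape, based only on the sample URL: /c/<slug>-c<digits>/
_CATEGORY_PATH = re.compile(r"^/c/(?P<slug>[a-z0-9]+(?:-[a-z0-9]+)*)-c(?P<id>\d+)/$")


def parse_category_url(url: str) -> Optional[CategoryRef]:
    """Return the slug and numeric identifier, or None if the path does not match."""
    match = _CATEGORY_PATH.match(urlsplit(url).path)
    if match is None:
        return None
    return CategoryRef(match.group("slug"), int(match.group("id")))
```

With the identifier extracted, the same category could in principle be matched across country sites even when the localized slug differs.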
It is uncertain when exactly this will happen. The gradual rollout of One Platform and the single tree to Heureka Group countries will take several months. Maybe years. It doesn't quite fit into our original subdomain SEO campaign anymore, so we're publishing this text purely as an explanation of the current state of play.14
The Dark Side
What can we expect after the re-platforming, purely from an SEO perspective? Imagine a real tree. We pull it out of the ground. We cut up the trunk, branches, and shoots. We rip off the leaves and sort everything into piles. Then, in the same place (the top-level domain), we glue the trunk (top-level categories) and branches (subcategories) back together, more or less as they were. We replace the shoots (filters). We leave the leaves (product details) on a pile off to the side. Then we root the tree again. Will it thrive?
Specifically, we're talking about moving about 3500 categories from subdomains to new addresses, adjusting the logic of some filters, and moving approximately 29,000,000 products out of the categories to new URLs. And that is only the Czech site. Add to that the millions of products and thousands of categories on sites in the other eight Heureka Group countries.
From a purely technical point of view, this change should be no different from, and no more exceptional than, all the previous activities, which turned out well. It is a bunch of redirects with relatively easily identifiable patterns. We can handle these very well thanks to the Redirect Tool, which we develop internally.
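As an illustration of what a redirect with an "easily identifiable pattern" might look like, the sketch below maps the root of an old category subdomain to a new-style category URL. It does not describe the internal Redirect Tool. The function name, the `category_ids` lookup table, and the target URL shape (taken from the sample category URL in this article) are all assumptions, and only the subdomain root is handled.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit

NEW_HOST = "www.heureka.cz"
OLD_SUFFIX = ".heureka.cz"


def category_redirect_target(old_url: str, category_ids: Mapping[str, int]) -> Optional[str]:
    """Map https://<label>.heureka.cz/ to https://www.heureka.cz/c/<label>-c<id>/.

    `category_ids` is a hypothetical lookup from subdomain label to category ID.
    Returns None for anything that is not the root of a known category subdomain.
    """
    parts = urlsplit(old_url)
    host = (parts.hostname or "").lower()
    if host == NEW_HOST or not host.endswith(OLD_SUFFIX):
        return None
    label = host[: -len(OLD_SUFFIX)]
    if parts.path not in ("", "/") or label not in category_ids:
        return None
    return f"https://{NEW_HOST}/c/{label}-c{category_ids[label]}/"
```

Filters and product details under the old subdomains would need their own rules; they are left out here on purpose.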
But as we've already outlined, it is an extreme hit to the innermost structure of the site, in many areas at the same time. On top of that, any historical redirect rules will need to be modified to avoid long redirect chains or loops. There will also be a bridging period, similar to the one we described in the previous article about the mobile version. Temporarily, the new and old versions will run at the same time, and nothing will be allowed to leak out. There's not much room for error.
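Avoiding redirect chains and loops can be sketched as a pass over the rule set that points every source straight at its final destination. This is a generic illustration under the assumption that rules are simple one-to-one URL mappings; it is not how the Redirect Tool works, and `flatten_redirects` is a hypothetical name.

```python
from typing import Dict, Mapping


def flatten_redirects(rules: Mapping[str, str]) -> Dict[str, str]:
    """Rewrite each rule so its target is the end of the chain.

    Raises ValueError if following the rules from any source leads into a loop.
    """
    flattened: Dict[str, str] = {}
    for source in rules:
        seen = {source}
        target = rules[source]
        # Follow the chain while the current target is itself redirected.
        while target in rules:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = rules[target]
        flattened[source] = target
    return flattened
```

For example, if an old rule sends A to B and a new rule sends B to C, the flattened set sends both A and B directly to C, so no visitor or crawler has to take two hops.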
Architects of Destiny
Feel like foretelling the fate of the site? Let's put the tarot cards and crystal balls aside for a while. Web engineering is, or at least should be, a pragmatic15 discipline backed by data and a strong technical foundation. For example, we create the structure (or information architecture) of a website based on keyword analysis, which is primarily a data discipline.
Most people think of "information architecture" as just a taxonomy of categories and menus. But it is a much broader term that encompasses, in general, all information presented to humans and search engines. From the aforementioned hierarchy of categories to the organization of content on pages and internal links, to structured data, metadata, status codes, robots.txt, sitemaps, canonicals, and more. Being an information architect of a website in any respect is quite a responsible job.
Some of the listed components can be changed and modified quite easily and without penalty. Changes to content or structured data are not particularly risky. In contrast, interfering with the established hierarchy of the site or, in our case, completely dismantling it, must be done with great deliberation.
The trouble is that in SEO we are mere architects. At best, even builders. However, the real architect of our destiny is ultimately the mighty Cyber Trinity, represented by Spider of Crawling, Database of (Re)Indexing, and Algorithms of Ranking.
(Holy Trinity. Original image source: https://commons.wikimedia.org/wiki/File:Shield-Trinity-Scutum-Fidei-English.svg)
We have no idea how the search engines will react to such a massive change, and it's not entirely in our hands. We are realistic and therefore have to write that some sort of drop will probably happen. Anything more precise could, unfortunately, only be predicted from a crystal ball.
The category subdomains will be with us for a while, and no one knows what the future holds. We do not underestimate the preparations and remain optimistic that it will eventually turn out well, as it has so far. Keep your fingers crossed.
We're close to the end. In the next article, we'll do a quick recap and discuss whether subdomains are good or bad.
Series on SEO and subdomains
- Satan, SEO and Subdomains – VOL I. – Chaos Crawling
- Satan, SEO and Subdomains – VOL II. – Cancer SEOtherapy
- Satan, SEO and Subdomains – VOL III. – Controlled SEOcide
- Satan, SEO and Subdomains – VOL IV. – One To Rule MFI
- Satan, SEO and Subdomains – VOL V. – SEOfirot of the Tree of Life
- Satan, SEO and Subdomains – VOL VI. – Horsemen of the Apocalypse
Approach the text with caution. This article and the entire series are not intended as a guide. The texts do not contain any "universal" truths. Each site represents a unique system with different starting conditions. An individual approach and perfect knowledge of the specific site and the subject matter are required.
The article describes our website. We do not evaluate the general effectiveness of subdomains or directories. Nor do we recommend any specific solution. Again, this is a highly individual matter, influenced by many factors.
Strategies and detailed plans for some of the activities described here have been in the works for over a year. Everything has been discussed, tested, and validated many times. Please keep this in mind when you do similar activities yourself.
Some of the data presented may be inaccurate and purposefully distorted. For obvious reasons, we don't plan to disclose specific numbers such as organic traffic stats, revenue, conversions, and the like. However, key information such as subdomain counts, URLs, and our practices is presented truthfully without embellishment.
The text may contain advanced concepts and models that are not entirely standard in SEO. The articles are therefore supplemented with footnotes with sources where everything is explained in detail.
8. More precisely, product variants.
9. More details in the article https://www.heurekadevs.com/satan-seo-and-subdomains-iii-seocide.
11. Or isn't it? Find out more in the following article Horsemen of the SEOcalypse.
12. OnePlatform (CZ only) https://www.heurekadevs.cz/menimeheureku-chystame-mezinarodni-oneplatform.
13. These URLs are just a sample; they may not be the final form.
14. Further information might be published after the completion and evaluation of the re-platforming. But we cannot promise anything.