Most Wanted: We Launched Our Own Link Index With New Architecture
Let's look at its advantages and go through some information retrieval theory to understand what we built.
Backlinks index and its benefits
5 advantages of our index
You don't need to wait for changes to appear in the interface or switch between different indexes to see the current picture for your sites over the whole period. Data is updated as our robots crawl the Internet, and all of it is stored in one index.
How to build a link index
A diagram of the links between a site's pages is a helpful visual representation, but it makes no sense to store the links in the form of a diagram. The same information can be kept in a more compact form from which the diagram can be restored at any time.
Let's convert the diagram into a table: the rows and columns list all the pages of our site, and at the intersection of a row and a column we put 1 if the page in the row links to the page in the column, and 0 if there is no such link.
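As a rough illustration, the table can be built from a list of links. The sketch below uses the five-page example site from this article; the names `PAGES`, `LINKS`, and `build_link_table` are made up for the illustration and are not part of Serpstat's code.

```python
# Pages of the example site, in a fixed order that defines row/column positions.
PAGES = ["Main page", "Category 1", "Category 2", "Page 1", "Page 2"]

# (source, target) pairs: the links of the five-page example site.
LINKS = [
    ("Main page", "Category 1"), ("Main page", "Category 2"), ("Main page", "Page 2"),
    ("Category 1", "Main page"), ("Category 1", "Page 1"), ("Category 1", "Page 2"),
    ("Category 2", "Main page"),
    ("Page 1", "Main page"), ("Page 1", "Page 2"),
    ("Page 2", "Main page"), ("Page 2", "Page 1"),
]


def build_link_table(pages, links):
    """Return a square table: 1 where the row page links to the column page."""
    position = {page: i for i, page in enumerate(pages)}
    table = [[0] * len(pages) for _ in pages]
    for source, target in links:
        table[position[source]][position[target]] = 1
    return table


table = build_link_table(PAGES, LINKS)
for page, row in zip(PAGES, table):
    print(f"{page:<11}", row)
```

Each printed row corresponds to one page, and each position in the row to a possible link target.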
Suppose we want to store the structure of our site on a hard drive. The advantage of such a table over the visual representation is that it takes up much less storage space. Moreover, the zeros don't need to be stored at all.
There are usually many more links between pages of one site than links from pages of one site to pages of another. So our link table (an adjacency matrix, in graph terms) will be densely filled with ones in some places, but most of it will be empty. We could collapse the table row by row, turning each row into the list of pages that the row's page links to.
For our example table, it would look like this:
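One way to avoid storing zeros, sketched here under the assumption that the table is small enough to hold in memory, is to keep only the coordinates of the cells that contain 1. The table below is the one for this article's five-page example; `ones` and `has_link` are illustrative names.

```python
PAGES = ["Main page", "Category 1", "Category 2", "Page 1", "Page 2"]

# Full table for the example site: rows are sources, columns are targets.
table = [
    [0, 1, 1, 0, 1],  # Main page
    [1, 0, 0, 1, 1],  # Category 1
    [1, 0, 0, 0, 0],  # Category 2
    [1, 0, 0, 0, 1],  # Page 1
    [1, 0, 0, 1, 0],  # Page 2
]

# Keep only the (row, column) coordinates of the cells that contain 1.
ones = {
    (r, c)
    for r, row in enumerate(table)
    for c, cell in enumerate(row)
    if cell == 1
}


def has_link(source, target):
    """A missing coordinate means the cell was 0, so nothing is lost."""
    return (PAGES.index(source), PAGES.index(target)) in ones


total_cells = len(table) * len(table[0])
print(f"stored {len(ones)} of {total_cells} cells")
```

On a five-page site the saving is small, but the share of empty cells grows quickly as the number of pages increases.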
Main page → Category 1, Category 2, Page 2
Category 1 → Main page, Page 1, Page 2
Category 2 → Main page
Page 1 → Main page, Page 2
Page 2 → Main page, Page 1
This gives us the so-called direct index, also known as a forward index. But let's look at these lists and try to answer a question that interests many SEO specialists: which pages link to Page 2? We would have to go through all the lists and check whether Page 2 appears in each one. This is easy when there are five lists. But there are billions of pages on the Internet, and checking that many lists becomes a very time-consuming task.
To answer this question quickly, we can collapse the table by columns instead. As a result, we get, for each page, the list of pages that link to it:
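To make the cost concrete, here is a minimal sketch of that lookup against the direct index, stored as a plain Python dictionary. The function name `pages_linking_to` is made up for this illustration. Note that it has to visit every list, so the work grows with the number of pages in the index.

```python
# Direct index of the example site: page -> pages it links to.
forward_index = {
    "Main page": ["Category 1", "Category 2", "Page 2"],
    "Category 1": ["Main page", "Page 1", "Page 2"],
    "Category 2": ["Main page"],
    "Page 1": ["Main page", "Page 2"],
    "Page 2": ["Main page", "Page 1"],
}


def pages_linking_to(target, index):
    """Scan every list in the direct index and collect the sources that mention target."""
    return [source for source, targets in index.items() if target in targets]


print(pages_linking_to("Page 2", forward_index))
# ['Main page', 'Category 1', 'Page 1']
```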
Main page ← Category 1, Category 2, Page 1, Page 2
Category 1 ← Main page
Category 2 ← Main page
Page 1 ← Category 1, Page 2
Page 2 ← Main page, Category 1, Page 1
Now, to find the answer, we only need to locate the one list that belongs to Page 2; we don't need to go through the contents of every list. This is the backlink index. This is the form in which Serpstat stores link data for your site. It is a very simplified model, but its basic principles are correct.
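The column-wise collapse can be sketched as an inversion of the direct index. This is only an illustration of the simplified model described above, not Serpstat's actual storage code; `build_backlink_index` is a hypothetical name.

```python
from collections import defaultdict

# Direct index of the example site: page -> pages it links to.
forward_index = {
    "Main page": ["Category 1", "Category 2", "Page 2"],
    "Category 1": ["Main page", "Page 1", "Page 2"],
    "Category 2": ["Main page"],
    "Page 1": ["Main page", "Page 2"],
    "Page 2": ["Main page", "Page 1"],
}


def build_backlink_index(index):
    """Invert page -> targets into page -> sources that link to it."""
    backlinks = defaultdict(list)
    for source, targets in index.items():
        for target in targets:
            backlinks[target].append(source)
    return dict(backlinks)


backlink_index = build_backlink_index(forward_index)

# Answering "which pages link to Page 2?" is now a single dictionary lookup.
print(backlink_index["Page 2"])
# ['Main page', 'Category 1', 'Page 1']
```

The inversion is done once, when the index is built; after that, each question about a page's backlinks costs one lookup instead of a scan over every list.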
Serpstat link index
To keep the index up to date in such a dynamic environment, our Serpstatbot/1.0 bot (advanced backlink tracking bot; firstname.lastname@example.org) follows the rules in robots.txt and other basic rules. More details here.
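For readers unfamiliar with how a crawler honors robots.txt, here is a minimal sketch using Python's standard `urllib.robotparser` module. It is not Serpstat's crawler code: the robots.txt content, the `example.com` URLs, and the `may_crawl` helper are all made up for the illustration, and only the `Serpstatbot/1.0` name comes from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: serpstatbot
Disallow: /private/

User-agent: *
Disallow:
"""

USER_AGENT = "Serpstatbot/1.0"  # Bot name and version as given in the article.

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url):
    """Return True if the parsed robots.txt rules allow USER_AGENT to fetch url."""
    return parser.can_fetch(USER_AGENT, url)


print(may_crawl("https://example.com/blog/"))
print(may_crawl("https://example.com/private/report.html"))
```

A site owner who wants to keep a bot away from part of a site can do so with a `Disallow` rule like the one in this made-up file.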
We have many plans to improve both the index itself and the interface, so we have a big request: please give us feedback on our new index. If you have any comments, we are happy to talk with you personally. Your feedback can affect our development priorities and speed up the release of the functionality you need for link analysis.