Keyword clustering and Text analysis
Check out the featured guide on the Keyword clustering and Text analysis tools. The Keyword clustering and Text analysis section consists of two tools:
1. Keyword clustering - a mass grouping of uploaded keywords based on their semantic similarity.
2. Text analysis - URL analysis and recommendations on the page SEO (currently under development).
Keyword clustering is the process of grouping a set of keywords in such a way that keywords in the same group (called a cluster) are more similar to each other than to those in other groups. The level of similarity or difference depends on the parameters you set.
Why do you need keyword clustering?
- Grouping of semantically related keywords;
- Reliable automatic analysis of a set of keywords;
- Collecting the right keywords for specific pages;
- Creating a site’s SEO architecture;
- Searching for keywords that are in no way related to the topics of clustered keywords.
The most fundamental drawback of existing keyword clustering tools is that the resulting clusters may either contain keywords without a strong semantic similarity, or the analyzed
Unlike many competitors’ solutions, Serpstat employs intelligent hierarchical clustering, in which clusters are combined into superclusters. No preliminary data collection, such as gathering keyword search volumes, is required: you only need to upload a list of keywords and choose the region and clustering parameters. The Serpstat clustering tool doesn’t set a cluster center (a keyword with the highest search volume, which is compared with other keywords to detect the number of matching URLs in SERP). Instead, Serpstat looks for connections among all clustered keywords.
Let’s look in detail at the main settings of the tool.
In fact, there are only two of them: Weak/Strong and Soft/Hard.
The Weak parameter tells the system that, to be combined into a cluster, keywords must have at least 3 common URLs in the Top 30 search results, while Strong raises that requirement to 7 common URLs for keywords to be merged into a single cluster.
The next clustering parameter choice is Soft/Hard.
Soft tells the system that a cluster can be created if at least one pair of keywords
Hard requires all keywords in a cluster to have 3 or 7 common URLs in the Top 30 search results (the required number of common URLs is defined in the previous step, where you selected Weak or Strong clustering). The resulting clusters contain synonymous keywords with a high semantic similarity. At the same time, this clustering method produces many clusters, because keywords can be merged into a cluster only if they are closely related.
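The two settings can be pictured with a small sketch. It is not Serpstat’s actual algorithm: the function names, the `serp` dictionary (keyword mapped to its ranked result URLs) and the greedy assignment order are all assumptions made for illustration. In particular, Soft is assumed here to mean that a keyword joins a cluster when it meets the threshold with at least one member, while Hard requires it to meet the threshold with every member.

```python
from itertools import combinations

# Thresholds from the text: minimum number of common URLs in the Top 30.
STRENGTH_THRESHOLD = {"weak": 3, "strong": 7}


def common_urls(serp, a, b):
    """Count URLs that appear in the Top 30 results of both keywords."""
    return len(set(serp[a][:30]) & set(serp[b][:30]))


def cluster_keywords(serp, strength="weak", grouping="soft"):
    """Group keywords by SERP overlap (illustrative sketch only)."""
    threshold = STRENGTH_THRESHOLD[strength]
    keywords = list(serp)
    # Pairs of keywords that share enough URLs to count as linked.
    linked = {
        frozenset(pair)
        for pair in combinations(keywords, 2)
        if common_urls(serp, *pair) >= threshold
    }

    def is_linked(a, b):
        return frozenset((a, b)) in linked

    clusters = []
    for kw in keywords:
        if grouping == "soft":
            # Assumed Soft rule: one link to any member is enough, so every
            # cluster this keyword touches is merged together with it.
            touching = [c for c in clusters if any(is_linked(kw, m) for m in c)]
            merged = []
            for c in touching:
                merged.extend(c)
                clusters.remove(c)
            merged.append(kw)
            clusters.append(merged)
        else:
            # Hard rule: the keyword must be linked to every member.
            for c in clusters:
                if all(is_linked(kw, m) for m in c):
                    c.append(kw)
                    break
            else:
                clusters.append([kw])
    return clusters


# Hypothetical data: short URL lists stand in for real Top 30 results.
example_serp = {
    "running shoes": ["u1", "u2", "u3", "u4"],
    "trail shoes": ["u2", "u3", "u4", "u5", "u6"],
    "hiking boots": ["u4", "u5", "u6", "u7"],
}
soft_clusters = cluster_keywords(example_serp, "weak", "soft")
hard_clusters = cluster_keywords(example_serp, "weak", "hard")
```

With this data, "running shoes" and "hiking boots" share only one URL, so Weak+Soft still chains all three keywords into one cluster through "trail shoes", while Weak+Hard keeps "hiking boots" apart.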
Strength shows how closely a keyword is semantically related to the cluster’s topic on a scale from 0 to 1.
When clustering is complete, some of the initial keywords may appear in the Unsorted directory. These are keywords that were not assigned to any cluster. One reason for this can be that the keywords have no semantic similarity to the topic of the analyzed keyword set and should be removed from the dataset. Alternatively, you can create separate pages for these keywords or move them to one of the created clusters if you believe they belong there.
Which clustering method is right for you?
The decision should be based on the semantic similarity of the objects from your dataset.
If the keywords are initially closely related, for example, sneakers of different brands, you may want to choose Strong+Hard or Strong+Soft so that only the closest synonyms are combined into a cluster. You’ll get lots of clusters to use for separate pages or specific categories.
If the keywords cover various products and services, for example, a keyword collection for a multi-product store or a medical center with a full range of health-care services, it’s worth selecting Weak+Soft. Choosing Strong+Soft instead will produce more clusters, which are likely to be more topic-specific.
Meta-top is a list of major competitors in SERP for keywords from a cluster. The higher a page’s rank in the meta-top, the more relevant it is to the cluster’s topic.
Setting up a clustering project
Go to the Tools section and open Keyword clustering and Text analysis.
Click Create a project.
Name your project and input a domain name (optional).
Input a list of keywords or upload them from a file.
Choose a search engine and region.
Finally, choose Linkage strength, Type of grouping and click Finish.
The resulting clusters will look like this:
Here, 3 is a cluster, 2 is a supercluster, and 1 is a protocluster.
A supercluster is a set of clusters. It combines keywords with a high semantic similarity score, though they are slightly less similar than keywords within a single cluster.
A protocluster is a set of superclusters. Generally, a protocluster is made up of superclusters related to a specific category of objects. For example, if you’re developing SEO architecture for a multi-product store, one protocluster may contain superclusters associated with different types of refrigerators, and another may contain superclusters for microwave ovens of different brands. Protoclusters are designed to streamline the work with superclusters.
Here's the breakdown of the above figure:
1. Every keyword from a cluster has its connection strength. It provides a hint of how close that keyword is to the cluster's topic on a scale from 0 to 1.
2. Homogeneity shows the semantic consistency of a cluster on a scale from 0 to 1.
3. If you specified a domain while creating a project, we'll look at your website's pages and display the page which is the closest to the cluster's topic in the URL field. If you
Each cluster has a drop-down menu:
1. Add keywords — opens a window where you can add some keywords to the existing cluster.
3. Search keywords — opens a search box where you can look for specific keywords in the cluster.
4. Delete keywords — deletes checked keywords from the cluster.
5. Delete group — deletes the cluster from your project.
Serpstat Text analysis (hereinafter TA) is designed to provide recommendations on how to improve your on-page SEO: what changes you need to make on your page to better optimize it for keywords from a cluster, or what keywords you should insert into the page content if you’re doing page SEO from scratch.
TA analyzes the text on the landing page (if a URL has been specified), the list of keywords from a cluster and a set of pages from the Top 15 search results for keywords from the list. We assume that the search engine considers the text on those pages relevant to the researched search queries if the pages are displayed in Top 15 search results.
If a target URL is specified, the TA tool analyzes the text content of your page and suggests lexical items to be added to the page. The suggestions are based on the text content of top pages for keywords from the cluster. If you didn’t specify the URL, recommendations are based on the largest group of related competitors. In this case, Serpstat can’t know for sure that a proper group of competitor URLs has been selected.
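As a rough illustration of the idea of suggesting lexical items, the sketch below lists terms that appear on a large share of competitor pages but are missing from your page. It is not how TA works internally: the function names, the simple tokenizer and the `min_share` cutoff are assumptions chosen for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase alphanumeric terms in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def suggest_terms(page_text, competitor_texts, min_share=0.5):
    """List terms used by many competitor pages but absent from the page.

    min_share is a hypothetical cutoff: the share of competitor pages
    that must contain a term before it is suggested.
    """
    page_terms = tokenize(page_text)
    doc_freq = Counter()
    for text in competitor_texts:
        doc_freq.update(tokenize(text))  # count each term once per page
    needed = min_share * len(competitor_texts)
    candidates = [
        term for term, count in doc_freq.items()
        if count >= needed and term not in page_terms
    ]
    # Most widely used terms first, ties broken alphabetically.
    return sorted(candidates, key=lambda term: (-doc_freq[term], term))


# Hypothetical texts standing in for your page and the top-ranking pages.
suggestions = suggest_terms(
    "buy running shoes online",
    [
        "running shoes with cushioning",
        "lightweight running shoes cushioning",
        "trail shoes",
    ],
)
```

Here "cushioning" appears on two of the three competitor pages and not on the target page, so it is the only suggestion.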
Choose a keyword cluster you’d like to analyze, input a URL and click Start analysis.
Upon completion, click See results.
How the credits for Clustering are spent:
1 keyword = 5 credits.
Number of keywords * 5 = number of spent credits.
For example: When you add 20 keywords to the project, you spend 100 credits:
20 * 5 = 100 credits.
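The same arithmetic as a small helper, in case you want to estimate the cost of a keyword list before uploading it. The function and constant names are made up for this example; only the rate of 5 credits per keyword comes from the text.

```python
CREDITS_PER_KEYWORD = 5  # rate stated in this guide


def clustering_credits(keyword_count):
    """Return the credits spent on clustering the given number of keywords."""
    if keyword_count < 0:
        raise ValueError("keyword_count must not be negative")
    return keyword_count * CREDITS_PER_KEYWORD


cost_for_20 = clustering_credits(20)
```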
Please note: Credits for Clustering, Text analysis, Domains batch analysis and Keywords batch analysis are shared.