Keyword clustering and Text analysis

This guide covers the Keyword clustering and Text analysis section, which consists of two tools:
1. Keyword clustering - a mass grouping of uploaded keywords based on their semantic similarity.
2. Text analysis - URL analysis and recommendations on the page SEO (under development currently).

Keyword clustering

Keyword clustering is the process of grouping a set of keywords so that keywords in the same group (called a cluster) are more similar to each other than to those in other groups. The level of similarity depends on the parameters you set.

Why do you need keyword clustering?
- Grouping of semantically related keywords;
- Reliable automatic analysis of a set of keywords;
- Collecting the right keywords for specific pages;
- Creating a site’s SEO architecture;
- Searching for keywords that are in no way related to the topics of clustered keywords.
The main drawback of existing keyword clustering tools is that the resulting clusters either contain keywords without strong semantic similarity (an issue with Soft clustering), or the data set is split into too many clusters that could have been merged into larger ones (an issue with the Hard method). Both clustering types also share a common drawback: clusters on similar topics end up scattered.
Unlike many competitors’ solutions, Serpstat employs intelligent hierarchical clustering, in which clusters are combined into superclusters. No preliminary data collection, such as keyword search volumes, is required: you only need to upload a list of keywords and choose the region and clustering parameters. The Serpstat clustering tool doesn’t set a cluster center (a keyword with the highest search volume, which is compared with other keywords to count matching URLs in the SERP). Instead, Serpstat looks for connections among all clustered keywords.

Let’s look in detail at the main settings of the tool.
In fact, there are only two of them: Weak/Strong and Soft/Hard.
The Weak parameter tells the system that keywords must share at least 3 URLs in their Top 30 search results to be combined into a cluster, while Strong requires at least 7 shared URLs.
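The Weak/Strong rule can be illustrated with a minimal Python sketch. The thresholds (3 and 7) and the Top 30 cutoff come from the description above; the function names and the idea of passing each keyword's results as a plain list of URLs are assumptions made for illustration, not Serpstat's actual implementation.

```python
# Thresholds from the text: Weak = 3 shared URLs, Strong = 7 shared URLs.
LINKAGE_THRESHOLDS = {"weak": 3, "strong": 7}
TOP_N = 30  # only the Top 30 results of each keyword are compared


def shared_url_count(serp_a, serp_b, top_n=TOP_N):
    """Count URLs that appear in the top-N results of both keywords."""
    return len(set(serp_a[:top_n]) & set(serp_b[:top_n]))


def are_linked(serp_a, serp_b, strength="weak"):
    """Return True if two keywords share enough URLs for the chosen strength."""
    return shared_url_count(serp_a, serp_b) >= LINKAGE_THRESHOLDS[strength]


# Hypothetical search results for two keywords (URLs are placeholders).
serp_a = ["u1", "u2", "u3", "u4"]
serp_b = ["u2", "u3", "u4", "u9"]
print(shared_url_count(serp_a, serp_b))      # 3
print(are_linked(serp_a, serp_b, "weak"))    # True
print(are_linked(serp_a, serp_b, "strong"))  # False
```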
The next clustering parameter choice is Soft/Hard.
Soft tells the system that a cluster can be created if at least one pair of keywords has 3 or 7 common URLs in Top 30 search results (depending on the previous Weak/Strong choice).

Hard requires all keywords in a cluster to have 3 or 7 common URLs in Top 30 search results (the required number of common URLs is set in the previous step, where you selected Weak or Strong clustering). The resulting clusters contain synonymous keywords with high semantic similarity. At the same time, this method produces many clusters, because keywords are merged into a cluster only if they are closely related.
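The difference between Soft and Hard can be sketched as "linked to at least one member" versus "linked to every member". The code below is a simplified, greedy one-pass illustration under that reading; the names `shared_urls` and `cluster_keywords` are hypothetical, the pairwise interpretation of Hard is an assumption, and this is not the hierarchical algorithm Serpstat actually uses.

```python
def shared_urls(serps, a, b, top_n=30):
    """Number of URLs that keywords a and b have in common in their top-N results."""
    return len(set(serps[a][:top_n]) & set(serps[b][:top_n]))


def cluster_keywords(serps, threshold=3, hard=False):
    """Greedy grouping: a keyword joins the first cluster it qualifies for.

    Soft (hard=False): a link to at least one cluster member is enough.
    Hard (hard=True): a link to every cluster member is required.
    """
    clusters = []
    for kw in serps:
        for group in clusters:
            links = [shared_urls(serps, kw, member) >= threshold for member in group]
            if all(links) if hard else any(links):
                group.append(kw)
                break
        else:
            clusters.append([kw])  # no suitable cluster found, start a new one
    return clusters


# Placeholder data: "a" and "b" share 3 URLs, "b" and "c" share 3, "a" and "c" share none.
serps = {
    "a": ["u1", "u2", "u3", "u4", "u5", "u6"],
    "b": ["u1", "u2", "u3", "u7", "u8", "u9"],
    "c": ["u7", "u8", "u9", "u10", "u11", "u12"],
}
print(cluster_keywords(serps, threshold=3, hard=False))  # [['a', 'b', 'c']]
print(cluster_keywords(serps, threshold=3, hard=True))   # [['a', 'b'], ['c']]
```

With the same data, Soft chains all three keywords into one broad cluster, while Hard keeps "c" separate because it has no URLs in common with "a". This mirrors the trade-off described above: fewer, looser clusters with Soft and more, tighter clusters with Hard.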

Strength shows how closely a keyword is semantically related to the cluster’s topic on a scale from 0 to 1.
Once clustering is complete, part of the initial keyword set may appear in the Unsorted directory. These are keywords that did not fall into any cluster. One possible reason is that they have no semantic similarity to the topic of the analyzed keyword set and should be removed from the dataset. Alternatively, you can create separate pages for these keywords or move them to one of the created clusters if you believe they belong there.

Which clustering method is right for you?
The decision should be based on the semantic similarity of the objects from your dataset.
If the keywords are initially closely related, for example, sneakers of different brands, you may want to choose Strong+Hard or Strong+Soft so that only the closest synonyms are combined into a cluster. You’ll get lots of clusters to use for separate pages or specific categories.
If the keywords cover a variety of products and services, for example, a keyword collection for a multi-product store or a medical center with a full range of health-care services, it’s worth selecting Weak+Soft. Choosing Strong+Soft instead will produce more clusters, which are likely to be more topic-specific.

Meta-top is a list of major competitors in SERP for keywords from a cluster. The higher a page’s rank in the meta-top, the more relevant it is to the cluster’s topic.

Setting up a clustering project
Go to the Tools section and open Keyword clustering and Text analysis.

Click Create a project.
Name your project and input a domain name (optional).

Input a list of keywords or upload them from a file.

Choose a search engine and region.

Finally, choose Linkage strength, Type of grouping and click Finish.

The resulting clusters will look like this:

[Figure: clustering results with numbered callouts 1 to 3]

In the figure, 1 marks a protocluster, 2 a supercluster, and 3 a cluster.
A supercluster is a set of clusters. It combines keywords with high semantic similarity, though slightly lower than that of keywords within a single cluster.
A protocluster is a set of superclusters. Generally, a protocluster is made up of superclusters related to a specific category of objects. For example, if you’re developing SEO architecture for a multi-product store, one protocluster may contain superclusters for different types of refrigerators, and another may contain superclusters for microwave ovens of different brands. Protoclusters are designed to streamline work with superclusters.
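The three-level hierarchy can be pictured as nested containers. The sketch below uses Python dataclasses (Python 3.9+); the class names, fields, and sample keywords are illustrative assumptions, not Serpstat's export format or API.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Lowest level: a group of closely related keywords."""
    name: str
    keywords: list[str] = field(default_factory=list)


@dataclass
class Supercluster:
    """Middle level: a set of clusters on a similar topic."""
    name: str
    clusters: list[Cluster] = field(default_factory=list)


@dataclass
class Protocluster:
    """Top level: a set of superclusters for one category of objects."""
    name: str
    superclusters: list[Supercluster] = field(default_factory=list)

    def all_keywords(self):
        """Flatten the hierarchy into a single list of keywords."""
        return [
            kw
            for supercluster in self.superclusters
            for cluster in supercluster.clusters
            for kw in cluster.keywords
        ]


# Hypothetical example based on the refrigerator case in the text.
proto = Protocluster(
    "refrigerators",
    [
        Supercluster(
            "compact refrigerators",
            [
                Cluster("mini fridge", ["mini fridge", "small fridge"]),
                Cluster("bar fridge", ["bar fridge"]),
            ],
        )
    ],
)
print(proto.all_keywords())  # ['mini fridge', 'small fridge', 'bar fridge']
```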

[Figure: a cluster with numbered elements 1 to 3]

Here's the breakdown of the above figure:
1. Every keyword from a cluster has its connection strength. It provides a hint of how close that keyword is to the cluster's topic on a scale from 0 to 1. 
2. Homogeneity shows the semantic consistency of a cluster on a scale from 0 to 1.
3. If you specified a domain while creating a project, we'll look at your website's pages and display the page closest to the cluster's topic in the URL field. If you didn't input a domain, you can add a URL manually by clicking Add URL. You can launch Text analysis for any keyword cluster.
Each cluster has a drop-down menu:

[Figure: cluster drop-down menu]

1. Add keywords — opens a window where you can add some keywords to the existing cluster.
2. Toggle metatop — opens a list of direct competitors for keywords from the cluster. The higher a page is listed, the more relevant it is to the cluster's topic.
3. Search keywords — opens a search box where you can look for specific keywords in the cluster.
4. Delete keywords — deletes checked keywords from the cluster.
5. Delete group — deletes the cluster from your project.

Text analysis
Serpstat Text analysis (hereinafter TA) provides recommendations on how to improve your on-page SEO: what changes you need to make to better optimize a page for keywords from a cluster, or which keywords to include in the page content if you’re optimizing a page from scratch.

TA analyzes the text on the landing page (if a URL has been specified), the list of keywords from a cluster and a set of pages from the Top 15 search results for keywords from the list. We assume that the search engine considers the text on those pages relevant to the researched search queries if the pages are displayed in Top 15 search results.
If a target URL is specified, the TA tool analyzes the text content of your page and suggests lexical items to add to it. The suggestions are based on the text content of top pages for keywords from the cluster. If you didn’t specify a URL, recommendations are based on the largest group of related competitors; in this case, Serpstat can’t be sure that the right group of competitor URLs has been selected.
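The general idea of suggesting lexical items that top pages use but your page lacks can be sketched as follows. This is a deliberately naive illustration: the tokenization, the `min_pages` cutoff, and the function names are all assumptions, and Serpstat does not document its scoring at this level of detail.

```python
import re
from collections import Counter


def tokens(text):
    """Split text into a set of lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def suggest_terms(target_text, competitor_texts, min_pages=2):
    """Suggest words used on at least `min_pages` competitor pages but absent from the target."""
    page_counts = Counter()
    for text in competitor_texts:
        page_counts.update(tokens(text))  # each page counts a word at most once
    present = tokens(target_text)
    missing = [
        term for term, pages in page_counts.items()
        if pages >= min_pages and term not in present
    ]
    # Most widely used terms first, ties broken alphabetically.
    return sorted(missing, key=lambda term: (-page_counts[term], term))


# Placeholder texts standing in for your page and the top-ranking pages.
target = "buy running shoes"
competitors = [
    "running shoes with cushioning",
    "lightweight running shoes cushioning",
    "trail shoes",
]
print(suggest_terms(target, competitors))  # ['cushioning']
```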
Choose a keyword cluster you’d like to analyze, input a URL and click Start analysis.

Upon completion, click See results.

Spending credits

How the credits for Clustering are spent:
1 keyword = 5 credits.
Number of keywords * 5 = number of spent credits.

For example: When you add 20 keywords to the project, you spend 100 credits:
20 * 5 = 100 credits.
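The same arithmetic as a small Python helper, useful for estimating cost before uploading a list. The rate of 5 credits per keyword comes from the text above; the function and constant names are illustrative.

```python
CREDITS_PER_KEYWORD = 5  # rate stated in this guide


def clustering_credits(keyword_count):
    """Return the number of credits spent on clustering `keyword_count` keywords."""
    if keyword_count < 0:
        raise ValueError("keyword_count must be non-negative")
    return keyword_count * CREDITS_PER_KEYWORD


print(clustering_credits(20))  # 100
```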

Please note: Clustering, Text analysis, Domains batch analysis, and Keywords batch analysis all draw on the same pool of credits.
