What is a semantic core? How to correctly compose an effective semantic core yourself

Novice webmasters faced with the need to create a semantic core often don't know where to start, although there is nothing complicated about the process. Simply put, you need to collect the list of key phrases that Internet users type in to find the kind of information your site covers.

The more complete and accurate that list is, the easier it is for a copywriter to write a good text and for you to earn high positions for the right queries. How to correctly compose a large, high-quality semantic core, and what to do with it afterwards so the site reaches the top and gathers plenty of traffic, is what this material covers.

A semantic core is a set of key phrases grouped by meaning, where each group reflects one need or desire of the user (an intent) - that is, what a person is thinking about when typing a query into the search bar.

The entire process of creating a core can be represented in 4 steps:

  1. We are faced with a task or problem;
  2. We formulate in our heads how we can find its solution through a search;
  3. We enter a request into Yandex or Google. Besides us, other people do the same;
  4. The most frequent variants of requests end up in analytics services and become key phrases that we collect and group according to needs. As a result of all these manipulations, a semantic core is obtained.

Is it necessary to select key phrases or can you do without it?

Previously, semantics was compiled in order to find the most frequent keywords on a topic, fit them into the text, and gain good visibility for them in search. Over the past 5 years, search engines have been moving toward a model where the relevance of a document to a query is assessed not by the number of keywords and the variety of their forms in the text, but by how well the page covers the intent.

For Google, this began in 2013 with the Hummingbird algorithm; for Yandex, in 2016 and 2017 with the Palekh and Korolev technologies, respectively.

Texts written without collected semantics cannot cover a topic fully, which means they cannot compete with the TOP for high-frequency and mid-frequency queries. Relying only on low-frequency queries makes no sense either - on their own they bring too little traffic.

If you want to successfully promote yourself or your product on the Internet in the future, you need to learn how to create the right semantics that fully reveal the needs of users.

Classification of search queries

Let's look at 3 types of parameters by which keywords are evaluated.

By frequency:

  • High Frequency (HF) - phrases that define a topic. Consist of 1-2 words. On average, the number of search queries starts from 1000-3000 per month and can reach hundreds of thousands of impressions, depending on the topic. Most often, the main pages of websites are designed for them.
  • Mid-frequency (MF) – separate directions in the topic. Mostly contain 2-3 words. With an exact frequency of 500 to 1000. Usually categories for a commercial site or topics for large information articles.
  • Low frequency (LF) – queries related to the search for a specific answer to a question. As a rule, from 3-4 words. This could be a product card or the topic of an article. On average, searches range from 50 to 500 people per month.
  • When analyzing data from Metrica or other statistics counters, you may come across another type - micro low-frequency keys. These are phrases that are asked only a handful of times. There is no point in optimizing a page specifically for them; it is enough to rank in the top for the low-frequency queries that include them.
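To make the thresholds above concrete, here is a minimal sketch of how you might bucket an exported phrase list by monthly exact frequency. The threshold values and the `phrases` data are illustrative assumptions, not fixed rules - adjust them to your niche.

```python
# Rough frequency bucketing by monthly exact frequency.
# Thresholds are illustrative; tune them to the topic you work in.
def frequency_bucket(exact_freq: int) -> str:
    if exact_freq >= 1000:
        return "HF"        # high frequency: defines the topic
    if exact_freq >= 500:
        return "MF"        # mid frequency: a direction within the topic
    if exact_freq >= 50:
        return "LF"        # low frequency: a specific question
    return "micro-LF"      # asked a handful of times; not worth a dedicated page

phrases = {"semantic core": 2900, "collect semantic core wordstat": 140, "semantic core example xls": 8}
for phrase, freq in phrases.items():
    print(f"{phrase}: {frequency_bucket(freq)}")
```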


By competitiveness:

  • Highly competitive (HC);
  • Medium competition (MC);
  • Low competition (LC).

According to need:

  • Navigational. Express the user’s desire to find a specific Internet resource or information on it;
  • Informational. Characterized by the need to obtain information as a response to a request;
  • Transactional. Directly related to the desire to make a purchase;
  • Vague or general. Those for which it is difficult to accurately determine the intent.
  • Geo-dependent and geo-independent. Reflect the need to search for information or complete a transaction in your city or without regional reference.


Depending on the type of site, you can give the following recommendations when selecting key phrases for the semantic core.

  1. Information resource. The main emphasis should be on finding article topics in the form of mid-frequency and low-frequency queries with low competition. It is recommended to cover the topic broadly and deeply, optimizing each page for a large number of low-frequency keys.
  2. Online store or commercial site. We collect HF, MF and LF queries, segmenting as precisely as possible so that all phrases in a group are transactional and belong to the same cluster. We focus on finding well-converting low-frequency, low-competition keywords.

How to correctly compose a large semantic core - step-by-step instructions

We moved on to the main part of the article, where I will sequentially analyze the main stages that need to be completed to build the core of the future site.
To make the process clearer, all steps are given with examples.

Search for basic phrases

Working with the SEO core begins with selecting a primary list of basic words and phrases that characterize the topic best and are used in a broad sense. They are also called markers.

These can be names of directions, types of products, or popular queries from the topic. As a rule, they consist of 1-2 words and get tens, sometimes hundreds of thousands of impressions per month. It is better not to take overly broad keywords, so as not to drown in negative keywords at the expansion stage.

The most convenient way to select marker phrases is Yandex Wordstat. After entering a query, the left column shows the phrases that contain it, and the right column shows similar queries, among which you can often find ones suitable for expanding the topic. The service also shows the base frequency of a phrase, that is, how many times per month it was asked in all word forms and with any words added to it.

By itself, this base frequency is of little interest, so to get more accurate values you need to use operators. Let's figure out what they are and what they are needed for.

Yandex Wordstat operators:

1) “…” – quotation marks. A query in quotation marks allows you to track how many times a phrase was searched in Yandex with all its word forms, but without adding other words (tails).

2) ! - exclamation mark. Placing it before each word in the query fixes the word form, so we get the number of impressions for the key phrase only in the specified word forms, but still with tails allowed.

3) “!... !... !...” - quotation marks and an exclamation mark before each word. The most important operator for the optimizer. It allows you to understand how many times a keyword is requested per month strictly for a given phrase, as it is written, without adding any words.

4) +. By default, Yandex Wordstat ignores prepositions and pronouns in a query. If you need it to include them, put a plus sign in front of them.

5) -. The second most important operator. With its help, words that do not fit are quickly eliminated. To use it, after the analyzed phrase we put a minus sign and a stop word. If there are several of them, repeat the procedure.

6) (…|…). If you need to get data from Yandex Wordstat for several phrases at once, enclose them in brackets and separate them with a vertical bar. In practice, this method is rarely used.
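As a quick illustration of how these operators combine, here is a small sketch that builds the query variants described above for one phrase. The phrase itself and the helper name are just examples:

```python
# Builds the Wordstat query variants for one phrase.
# "..."   - all word forms, but no added words (no tails)
# !word   - fixed word form, tails allowed
# "!a !b" - fixed word forms and no tails (the "exact" frequency)
def wordstat_variants(phrase: str) -> dict:
    exclaimed = " ".join("!" + w for w in phrase.split())
    return {
        "base": phrase,              # all word forms + any tails
        "quoted": f'"{phrase}"',     # all word forms, no tails
        "exact": f'"{exclaimed}"',   # fixed forms, no tails
    }

print(wordstat_variants("semantic core"))
# {'base': 'semantic core', 'quoted': '"semantic core"', 'exact': '"!semantic !core"'}
```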

For convenience when working with the service, I recommend installing the "Wordstat Assistant" browser extension. It works in Mozilla Firefox, Google Chrome and Yandex Browser and lets you copy phrases and their frequencies with one click on the "+" or "Add all" icon.


Let's say we decide to run a blog about SEO. We'll pick the following basic phrases for it:

  • semantic core;
  • optimization;
  • copywriting;
  • promotion;
  • monetization;
  • Direct

Search for synonyms

When formulating a query to search engines, users can use words that are close in meaning, but different in spelling.

For example, “car” and “machine”.

It is important to find as many synonyms for the main words as possible in order to increase the coverage of the future semantic core. If this is not done, then during parsing we will miss a whole layer of key phrases that reveal the needs of users.

What we use:

  • Brainstorm;
  • Right column of Yandex Wordstat;
  • Queries typed in Cyrillic;
  • Special terms, abbreviations, slang expressions from the topic;
  • The Yandex and Google related-search blocks ("searched together with this query");
  • Snippets of competitors.

As a result of all actions for the selected topic, we get the following list of phrases:


Basic Query Expansion

Let's parse these keywords to identify the basic needs of people in this area.
The most convenient way to do this is in the Key Collector program, but if you don't want to pay 1,800 rubles for a license, use its free analogue - Slovoeb.

In terms of functionality it is, of course, weaker, but it is suitable for small projects.
If you don't want to delve into desktop programs at all, you can use the Just-Magic and Rush Analytics services. Still, it is better to spend a little time and learn the software.

I will show the working principle in Key Collector, but if you work with Slovoeb, everything will be clear as well - the program interfaces are similar.

Procedure:

1) Add the list of basic phrases to the program and collect the base and exact frequencies for them. If we are planning promotion in a specific region, we set the region. For informational sites this is most often not necessary.


2) Let's parse the left column of Yandex Wordstat using the added words to get all the queries from our topic.


3) As a result, we got 3,374 phrases. Now let's collect the exact frequency for them, as in step 1.


4) Let’s check if there are any keys with zero base frequency in the list.


If there are any, delete them and move on to the next step.

Negative keywords

Many people neglect collecting negative keywords, simply deleting unsuitable phrases instead. Later, though, you will see that a negative list is convenient and really does save time.

Open the Data -> Analysis tab in Key Collector, select grouping by individual words, and scroll through the list of keys. When we see a phrase that does not fit, we click the blue icon next to it and add the word, with all its word forms, to the stop words.


In Slovoeb, working with stop words is implemented in a more simplified version, but you can also create your own list of phrases that are not suitable and apply them to the list.

Don’t forget to use sorting by Base Frequency and number of phrases. This option helps you quickly reduce the list of initial phrases or weed out rarely occurring ones.


After we have compiled a list of stop words, we apply them to our project and move on to collecting search tips.
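If you prefer to apply a stop list outside Key Collector (for example, to a CSV export), the idea boils down to a simple filter. The matching below is deliberately naive (prefix matching per word, to roughly cover word forms) and the stop words are only examples:

```python
# Naive stop-word filter for an exported phrase list.
# A phrase is dropped if any of its words starts with a stop-word stem.
stop_stems = ["free", "download", "torrent"]   # example stop words

def passes(phrase: str) -> bool:
    return not any(word.startswith(stem)
                   for word in phrase.lower().split()
                   for stem in stop_stems)

phrases = ["semantic core example", "download semantic core collector free"]
kept = [p for p in phrases if passes(p)]
print(kept)   # ['semantic core example']
```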

Parsing search suggestions

When you enter a query into Yandex or Google, search engines offer their own options for continuing it from the most popular phrases that Internet users type in. These keywords are called search suggestions.

Many of them never make it into Wordstat, so when building a semantic core it is necessary to collect such queries as well.

By default, Key Collector parses suggestions by iterating over endings, Cyrillic and Latin characters, and a space after each phrase. If you are ready to sacrifice quantity to significantly speed up the process, check the box "Collect only the TOP hints without brute force and a space after the phrase".


Often among search suggestions you can find phrases with good frequency and competition tens of times lower than in Wordstat, so in narrow niches I recommend collecting as many words as possible.

The time spent parsing suggestions depends directly on the number of simultaneous requests to the search engines' servers. Key Collector supports a maximum of 50 threads.
But to parse in this mode, you will need the same number of proxies and Yandex accounts.

For our project, after collecting tips, we got 29,595 unique phrases. In terms of time, the entire process took a little more than 2 hours on 10 threads. That is, if there are 50 of them, we’ll do it in 25 minutes.


Determination of base and exact frequencies for all phrases

For further work, it is important to determine the basic and exact frequency and eliminate all zeros. We leave requests with a small number of impressions if they are targeted.
This will help you better understand the intent and create a more complete article structure than is in the top.

Before collecting the frequencies, we first filter out everything unnecessary:

  • phrases with repeated words;
  • keys containing other symbols;
  • duplicate phrases (via the "Implicit Duplicates Analysis" tool).
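The same cleanup can be reproduced outside the program. Here is a minimal sketch: drop phrases with repeated words or stray symbols and collapse implicit duplicates, i.e. phrases that differ only in word order (the normalization rule used here is an assumption about how such duplicates are detected):

```python
import re

def clean(phrases):
    seen, result = set(), []
    for p in phrases:
        words = p.lower().split()
        if len(words) != len(set(words)):                   # repeated words
            continue
        if re.search(r"[^a-zа-яё0-9\s-]", p.lower()):       # stray symbols
            continue
        signature = tuple(sorted(words))                    # implicit duplicate: same words, any order
        if signature in seen:
            continue
        seen.add(signature)
        result.append(p)
    return result

print(clean(["collect semantic core", "semantic core collect",
             "semantic @core", "core core semantic"]))
# ['collect semantic core']
```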


For the remaining phrases, we will determine the exact and base frequency.

A) for phrases up to 7 words:

  • Select through the filter “Phrase consists of no more than 7 words”
  • Open the “Collect from Yandex.Direct” window by clicking on the “D” icon;
  • If necessary, indicate the region;
  • Select the guaranteed impressions mode;
  • Set the collection period to 1 month and check the boxes for the required frequency types;
  • Click “Get data”.


b) for phrases of 8 words or more:

  • Set the filter for the “Phrase” column – “consists of at least 8 words”;
  • If you need to promote in a specific city, indicate the region below;
  • Click on the magnifying glass and select “Collect all types of frequencies.”


Cleaning keywords from garbage

After we have received information about the number of impressions for our keys, we can begin to filter out those that are not suitable.

Let's look at the procedure step by step:

1. Go to "Group Analysis" in Key Collector and sort the keys by the number of words. The task is to find frequently occurring non-target words and add them to the stop words.
We do everything the same as in the "Negative keywords" section.


2. We apply all the found stop words to the list of our phrases and go through it so as not to lose target queries. After checking, click “Delete Marked Phrases”.


3. We filter out dummy phrases that are rarely used in exact occurrences, but have a high base frequency. To do this, in the settings of the Key Collector program, in the “KEY&SERP” item, insert the calculation formula: KEY 1 = (YandexWordstatBaseFreq) / (YandexWordstatQuotePointFreq) and save the changes.


4. We calculate KEY 1 and delete those phrases for which this parameter is 100 or more.
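Outside Key Collector, the same dummy-phrase filter is a one-line check per phrase. A sketch, assuming you have both frequencies collected for each phrase (the data structure and sample numbers are illustrative):

```python
# KEY 1 = base frequency / exact ("!...") frequency.
# Phrases asked mostly as parts of other, longer queries get a large ratio and are dropped.
def filter_dummies(rows, max_ratio=100):
    kept = []
    for row in rows:
        exact = row["exact_freq"]
        if exact == 0:
            continue                              # zero exact frequency: a dummy by definition
        if row["base_freq"] / exact >= max_ratio:
            continue                              # high ratio: a dummy phrase
        kept.append(row)
    return kept

rows = [
    {"phrase": "semantic core", "base_freq": 2900, "exact_freq": 650},
    {"phrase": "core", "base_freq": 90000, "exact_freq": 400},   # ratio 225 -> dropped
]
print([r["phrase"] for r in filter_dummies(rows)])   # ['semantic core']
```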


The remaining keys need to be grouped by landing pages.

Clustering

The distribution of queries into groups begins with clustering the phrases by search results (the top 10) using the free program "Majento Clusterer". I can recommend a paid analogue with wider functionality and faster operation - KeyAssort - but for a small core the free one is quite enough. The only caveat is that to work in either of them you will need to buy XML limits. The average price is 5 rubles per 1,000 requests, so processing an average core of 20-30 thousand keys will cost 100-150 rubles. See the screenshot below for the address of the service to use.


The essence of this clustering method is to combine into one group the phrases whose Yandex top-10 results share URLs:

  • with each other (Hard);
  • with the most frequent query in the group (Soft).

Depending on the number of such matches for different sites, clustering thresholds are distinguished: 2, 3, 4 ... 10.
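To make the Hard/Soft distinction concrete, here is a simplified sketch of SERP-based grouping. It assumes you already have the top-10 URLs for every phrase (for example, from an XML API); real tools are more sophisticated, this only illustrates the matching rule for the Soft variant:

```python
# serp: dict phrase -> set of its top-10 URLs.
# Soft: a phrase joins a cluster if it shares >= threshold URLs with the cluster's most frequent phrase.
# Hard would additionally require that many shared URLs between every pair of phrases in the cluster.
def soft_cluster(serp, frequencies, threshold=3):
    clusters = []
    for phrase in sorted(serp, key=lambda p: -frequencies[p]):   # most frequent first
        for cluster in clusters:
            head = cluster[0]                                    # cluster's most frequent phrase
            if len(serp[phrase] & serp[head]) >= threshold:
                cluster.append(phrase)
                break
        else:
            clusters.append([phrase])
    return clusters

serp = {
    "semantic core": {"a", "b", "c", "d"},
    "how to build a semantic core": {"a", "b", "c", "e"},
    "wordstat operators": {"x", "y", "z", "a"},
}
freq = {"semantic core": 650, "how to build a semantic core": 170, "wordstat operators": 90}
print(soft_cluster(serp, freq))
# [['semantic core', 'how to build a semantic core'], ['wordstat operators']]
```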

The advantage of this method is the grouping of phrases according to people’s needs, and not just by synonymous connections. This allows you to immediately understand which keywords can be used on one landing page.

For informational sites, the following usually works:

  • Soft with a threshold of 3-4, followed by manual cleaning;
  • Hard with a threshold of 3, followed by merging clusters by meaning.

Online stores and commercial sites, as a rule, are promoted according to Hard with a clustering threshold of 3. The topic is voluminous, so I will discuss it later in a separate article.

For our project, after grouping using the Hard method on 3, we got 317 groups.


Competition Check

There is no point in promoting for highly competitive queries. It’s difficult to get to the top, and without it there will be no traffic to the article. To understand which topics are profitable to write on, we use the following method:

We look at the total exact frequency of the group of phrases the article is written for and at the competition according to Mutagen. For informational sites, I recommend taking topics with a total exact frequency of 300 or more and a competition score from 1 to 12 inclusive.

In commercial topics, focus on the marginality of a product or service and how competitors in the top 10 are doing. Even 5-10 targeted requests per month may be a reason to make a separate page for it.

How to check competition on a request:

a) manually, by entering the appropriate phrase in the service itself or through mass tasks;


b) in batch mode through the Key Collector program.


Topic selection and grouping

Let's consider each of the resulting groups for our project after clustering and select topics for the site.
Majento, unlike KeyAssort, does not let you export the number of impressions for each phrase, so you will have to collect them additionally through Key Collector.

Instructions:

1) Upload all groups from Majento in CSV format;
2) Concatenate phrases in Excel using the “group:key” mask;
3) Load the resulting list into Key Collector. In the settings, be sure to enable the "Group:Key" import mode and disable checking whether phrases are already present in other groups;


4) We collect the base and exact frequencies for the keywords in the newly created groups. (If you use KeyAssort, this is not necessary - the program can work with additional columns.)
5) We look for clusters with a unique intent that contain at least 3 phrases and more than 300 impressions in total across all queries. Next, we check the 3-4 most frequent of them for competition according to Mutagen. If among these phrases there are keys with competition below 12, we take the topic to work;

6) We look through the remaining groups. If there are phrases that are close in meaning and can be covered on one page, we combine them. For groups containing new meanings, we look at the total frequency of the phrases: if it is less than 150 per month, we set the group aside until we have gone through the entire core - it may later be possible to merge it with another cluster and reach 300 exact impressions, the minimum at which an article is worth taking into work. To speed up manual grouping, use the auxiliary tools: the quick filter and the frequency dictionary. They help you quickly find suitable phrases in other clusters;


Attention! How do you know whether clusters can be merged? Take 2 frequent keys from those selected in step 5 for the landing page and 1 query from the new group.
Add them to Arsenkin's "Upload Top 10" tool, specifying the desired region if necessary. Then look at the number of color-highlighted intersections of the 3rd phrase with the other two. We merge the groups if there are 3 or more intersections. If there are none or only one, the groups cannot be combined - the intents are different; with exactly 2 intersections, look at the search results by hand and use logic. A rough sketch of this check is shown after the list below.

7) After grouping the keys, we get a list of promising topics for articles and semantics for them.
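Here is the rough sketch of the merge check mentioned above. It assumes you can fetch the top-10 URLs for a query yourself; `top10` is a placeholder, not a real API call, and counting intersections against the union of the two reference SERPs is one possible interpretation of the rule:

```python
# Decide whether a candidate query can be merged into an existing landing-page cluster.
# top10() is a placeholder: plug in your own SERP source (an XML API, a scraper, etc.).
def top10(query: str) -> set:
    raise NotImplementedError("return the set of top-10 URLs for the query")

def can_merge(reference_keys, candidate, threshold=3):
    # URLs the candidate's SERP shares with the SERPs of the two reference keys
    reference_urls = set().union(*(top10(k) for k in reference_keys))
    intersections = len(top10(candidate) & reference_urls)
    if intersections >= threshold:
        return True      # 3+ shared results: same intent, merge
    if intersections <= 1:
        return False     # 0-1 shared results: different intents
    return None          # exactly 2: inspect the SERPs by hand
```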


Removing requests for another content type

When compiling a semantic core, it is important to understand that commercial queries are not needed for blogs and information sites. Just like online stores do not need information.

We go through each group and clean out everything unnecessary; if we cannot determine the intent of a query precisely, we check the search results or use the following tools:

  • Commercialization check from Pixel Tools (free, but with a daily check limit);
  • Just-Magic service, clustering with a checkmark to check the commerciality of the request (paid, cost depends on the tariff)

After this we move on to the last stage.

Phrase optimization

We optimize the semantic core so that SEO specialists and copywriters find it convenient to work with later. To do this, in each group we keep only the key phrases that reflect people's needs as fully as possible and contain as many synonyms of the main phrases as possible.

Algorithm of actions:

  • Sort the keywords in Excel or Key Collector alphabetically from A to Z;
  • Choose those that reveal the topic from different angles and in different words. All other things being equal, keep the phrases with a higher exact frequency or a lower KEY 1 value (the ratio of base frequency to exact frequency);
  • Delete keywords with fewer than 7 impressions per month that carry no new meanings and contain no unique synonyms.

An example of what a well-composed semantic core looks like:

I marked in red phrases that do not match the intent. If you neglect my recommendations for manual grouping and do not check compatibility, it will turn out that the page will be optimized for incompatible key phrases and you will no longer see high positions for promoted queries.

Final checklist

  1. We select the main high-frequency queries that set the topic;
  2. We look for synonyms for them using the left and right columns of Wordstat, competitor sites and their snippets;
  3. We expand the received queries by parsing the left column of Wordstat;
  4. We prepare a list of stop words and apply them to the resulting phrases;
  5. Parsing Yandex and Google tips;
  6. We collect the base and exact frequencies;
  7. We expand the list of negative keywords and clean out garbage and dummy (zero-traffic) queries;
  8. We do clustering using Majento or KeyAssort. For informational sites in Soft mode, the threshold is 3-4. For commercial Internet resources using the Hard method with a threshold of 3.
  9. We import the data into Key Collector and determine the competition of 3-4 phrases for each cluster with a unique intent;
  10. We select topics and decide on landing pages for queries based on an assessment of the total number of accurate impressions for all phrases from one cluster (from 300 for information specialists) and competition for the most frequent of them according to Mutagen (up to 12).
  11. For each suitable page, we look for other clusters with similar user needs. If we can consider them on one page, we combine them. When the need is not clear or there is a suspicion that there should be a different type of content or page as an answer to it, we check the search results or through the Pixel Tools or Just-Magic tools. For content sites, the core should consist of information requests; for commercial sites, transactional ones. We remove the excess.
  12. We sort the keys in each group alphabetically and leave those that describe the topic from different angles and in different words. All other things being equal, priority is given to those queries that have a lower ratio of base frequency to exact frequency and a higher number of precise impressions per month.

What to do with the SEO core after its creation

We compiled a list of keys, handed it to the author, and he wrote an excellent article that fully covers all the meanings. Ah, I'm daydreaming... A sensible text will only happen if the copywriter clearly understands what you want from him and how to check his own work.

Let's look at 4 components; work through them well and you are practically guaranteed plenty of targeted traffic to the article:

Good structure. We analyze the queries selected for the landing page and identify what needs people have in this topic. Next, we write an outline for the article that fully answers them. The task is to make sure that when people visit the site, they receive a voluminous and comprehensive answer regarding the semantics that you have compiled. This will give good behavioral and high relevance to the intent. After you have made a plan, look at your competitors' websites by typing the main promoted query into the search. You need to do it exactly in this order. That is, first we do it ourselves, then we look at what others have and, if necessary, we modify it.

Optimization for keys. We optimize the article itself for the 1-2 most frequent keys with Mutagen competition up to 12. Another 2-3 mid-frequency phrases can be used as subheadings, but in diluted form, that is, with extra words that are not part of the key inserted into them, and with synonyms and other word forms. Low-frequency phrases are handled by pulling out their unique part - the tail - and spreading it evenly through the text. The search engines will find and glue everything together themselves.

Synonyms for basic queries. We write them out separately from our semantic core and set the task for the copywriter to use them evenly throughout the text. This will help reduce the density of our main words and at the same time the text will be optimized enough to get to the top.

Thematic-setting phrases. LSIs themselves do not promote the page, but their presence indicates that the written text most likely belongs to the “pen” of an expert, and this is already a plus for the quality of the content. To search for thematic phrases, we use the “Technical Specifications for a Copywriter” tool from Pixel Tools.


An alternative method for selecting key phrases using competitor analysis services

There is a quick approach to creating a semantic core that is suitable for both beginners and experienced users. The essence of the method is that we initially select keywords not for the entire site or category, but specifically for an article or landing page.

It can be implemented in 2 ways, which differ in how we choose topics for the page and how deeply we expand the key phrases:

  • by parsing the main keys;
  • based on competitor analysis.

Each of them can be implemented at a simple or more complex level. Let's look at all the options.

Without using programs

A copywriter or webmaster often doesn’t want to deal with the interface of a large number of programs, but he needs good themes and key phrases for them.
This method is just for beginners and those who don’t want to bother. All actions are performed without the use of additional software, using simple and understandable services.

What you will need:

  • Keys.so service for competitor analysis - 1,500 rub. With the promo code "altblog" - a 15% discount;
  • Mutagen - checking query competition costs 30 kopecks per check, collecting base and exact frequencies costs 2 kopecks per check;
  • Bukvarix - free version, or a business account for 995 rub. (currently discounted to 695 rub.)

Option 1. Selecting a topic by parsing basic phrases:

  1. We select the main keys from the topic in a broad sense, using brainstorming and the left and right columns of Yandex Wordstat;
  2. Next, we look for synonyms for them, using the methods discussed earlier;
  3. We enter all the collected marker queries into Bukvarix (a paid plan is required) in the advanced mode "Search using a list of keywords";
  4. We indicate in the filter: “!Exact!frequency” from 50, Number of words from 3;
  5. We upload the entire list to Excel;
  6. We select all the keywords and send them for grouping to the Kulakov Clusterer service. If the site is regional, select the desired city. We leave the clustering threshold for informational sites at 2, for commercial sites we set it to 3;
  7. After grouping, we pick article topics by looking through the resulting clusters. We take those with at least 3 phrases and a unique intent. Analyzing the URLs of the top sites in the "Competitors" column (on the right in the Kulakov Clusterer results table) helps to better understand people's needs. Also, don't forget to check competition via Mutagen: run 2-3 queries from the cluster; if all of them score above 12, the topic is not worth taking;
  8. The name of the future landing page has been decided, all that remains is to select key phrases for it;
  9. From the “Competitors” field, copy 3 URLs with the appropriate type of pages (if the site is informational, we take links to articles; if it is a commercial site, then to stores);
  10. We insert them sequentially into keys.so and upload all the key phrases for them;
  11. We combine them in Excel and remove duplicates;
  12. The service data alone is not enough, so we need to expand it. Let's use Bukvarix again;
  13. The resulting list is sent for clustering to the “Kulakov Clusterer”;
  14. We select groups of requests that are suitable for the landing page, focusing on intent;
  15. We collect the base and exact frequencies through Mutagen in "Mass Tasks" mode;
  16. We upload a list with updated data on the number of impressions in Excel. We remove zeros for both types of frequencies;
  17. Also in Excel, we add a formula for the ratio of the base frequency to the exact one and leave only those keys for which this ratio is less than 100;
  18. We delete requests for other types of content;
  19. We leave phrases that reveal the main intention as fully as possible and in different words;
  20. We repeat all the same steps in steps 8-19 for the remaining topics.

Option 2. Select a topic through competitor analysis:

1. We look for the top sites in our field by entering high-frequency queries and viewing the results through Arsenkin's "Top 10 Analysis" tool. It is enough to find 1-2 suitable resources.
If we are promoting a site in a specific city, we specify the region;
2. Go to the keys.so service, enter the URLs of the sites we found, and see which of the competitors' pages bring the most traffic.
3. We check 3-5 of the most accurate frequency queries for competitiveness. If for all phrases it is above 12, then it is better to look for another topic that is less competitive.
4. If you need to find more sites for analysis, open the “Competitors” tab and set the parameters: similarity - 3, thematic - 10. Sort the data in descending order of traffic.
5. After we have chosen a topic, enter its name into the search results and copy 3 URLs from the top.
6. Next we repeat points 10-19 from the 1st option.

Using Key Collector or Slovoeb

This method will differ from the previous one only in the use of the Key Collector program for some operations and in a deeper expansion of the keys.

What you will need:

  • Key Collector program - 1,800 rubles;
  • all the same services as in the previous method.

"Advanced - 1"

  1. We parse the left and right columns of Yandex for the entire list of phrases;
  2. We remove the exact and basic frequency through Key Collector;
  3. We calculate the indicator key 1;
  4. We delete queries with zero frequency and those with KEY 1 > 100;
  5. Next, we do everything the same as in paragraphs 18-19 of option 1.

"Advanced - 2"

  1. We do steps 1-5, as in option 2;
  2. We collect keys for each URL in keys.so;
  3. Removing duplicates in Key Collector;
  4. We repeat Points 1-4, as in the “Advanced -1” method.

Now let's compare the number of keys obtained and their total exact frequency when collecting the semantic core by the different methods:

As the table shows, the best result came from the alternative method of building the core for a page - "Advanced 1 and 2". It yielded 34% more target keys, and the total traffic across the cluster was 51% higher than with the classic method.

Below in the screenshots you can see what the finished kernel looks like in each case. I took phrases with an exact number of impressions from 7 per month so that I could evaluate the quality of the keywords. For full semantics, see the table at the “View” link.

A)


B)


C)

Now you know that the most common method - the one everyone uses - is not always the best and most correct, but you shouldn't dismiss the other approaches either. Much depends on the topic itself. For commercial sites, where there are not many keys, the classic option is quite sufficient. On informational sites you can also get excellent results if you correctly draw up the copywriter's brief and do a good job on structure and SEO optimization. We will talk about all this in detail in the following articles.

3 common mistakes when creating a semantic core

1. Collecting phrases only superficially. Parsing Wordstat alone is not enough to get a good result!
More than 70% of queries that people enter rarely or only periodically never get there at all. Yet among them there are often key phrases with good conversion and really low competition. How not to miss them? Be sure to collect search suggestions and combine them with data from different sources (counters on websites, statistics services and keyword databases).

2. Mixing informational and commercial queries on one page. We have already discussed that key phrases differ by the type of need behind them. If a visitor who wants to make a purchase comes to your site and sees an article as the answer to his query, do you think he will be satisfied? No! Search engines reason the same way when ranking a page, which means you can immediately forget about the top for mid-frequency and high-frequency phrases. Therefore, if you are in doubt about the type of a query, check the search results or use the Pixel Tools or Just-Magic tools to determine commerciality.

3. Choosing very competitive queries for promotion. Positions for high-frequency, highly competitive phrases depend 60-70% on behavioral factors, and to earn those you need to reach the top. The more contenders there are, the longer the queue and the higher the requirements for sites. Everything is the same as in life or sports: becoming a world champion is much harder than earning the same title in your city.
Therefore, it is better to enter a quiet niche rather than an overheated one.

Previously, getting to the top was even harder. Sites sat there on a first-come-first-served basis: the leaders took the first places and could only be displaced by accumulating behavioral factors. And how do you accumulate those if you are on the second or third page... Yandex broke this vicious circle in the summer of 2015 by introducing the "multi-armed bandit" algorithm. Its essence is precisely to randomly raise and lower the positions of sites in order to see whether more worthy candidates for the top have appeared.

How much money do you need to start?

To answer this question, let's calculate the cost of the arsenal of programs and services needed to prepare and group key phrases for 100 articles.

The bare minimum (suitable for the classic version):

1. Slovoeb - free
2. Majento clusterer - free
3. For captcha recognition - 30 rubles.
4. Xml limits - 70 rub.
5. Checking the competition of a request for Mutagen - 10 checks per day for free
6. If you are not in a hurry and are willing to spend 20-30 hours on parsing, you can do without a proxy.
—————————
The total is about 100 rubles. If you solve the captchas yourself and earn XML limits in exchange for those shared from your own site, you can actually prepare the core for free. You just need to spend an extra day setting up and mastering the programs and another 3-4 days waiting for the parsing results.

A semanticist's standard toolkit (for the advanced and classic methods):

1. Key Collector - 1,900 rubles
2. KeyAssort - 1,700 rubles
3. Bukvarix (business account) - 650 rubles.
4. Competitor analysis service keys.so - 1,500 rubles.
5. 5 proxies - 350 rubles per month
6. Anti-captcha - approximately 30 rubles.
7. Xml limits - about 80 rubles.
8. Checking competition with Mutagen (1 check = 30 kopecks) - we’ll keep it to 200 rubles.
———————-
The total is 6,410 rubles. You can, of course, do without KeyAssort by replacing it with the Majento clusterer, and use Slovoeb instead of Key Collector. Then 2,810 rubles will be enough.

Should you entrust building the core to a "pro", or is it better to figure it out and do it yourself?

If a person regularly does what he loves and gets better at it, then following the logic, his results should definitely be better than those of a beginner in this field. But with the selection of keywords, everything turns out exactly the opposite.

Why does a beginner do better than a professional in 90% of cases?

It's all about the approach. The semanticist's task is not to assemble the best possible core for you, but to finish the work in the shortest possible time and at a quality level you will accept.

If you do everything yourself using the algorithms discussed earlier, the result will be an order of magnitude higher for two reasons:

  • You understand the topic. This means that you know the needs of your clients or site users and will be able to maximally expand marker queries for parsing at the initial stage, using a large number of synonyms and specific words.
  • Interested in doing everything well. The owner of a business or an employee of the company in which he works will, of course, approach the issue more responsibly and try to do everything to the maximum. The more complete the core and the more low-competitive queries it contains, the more targeted traffic it will be possible to collect, which means the profit will be higher for the same investments in content.

How do you find the remaining 10% who will build a core better than you would?

Look for companies for which the selection of key phrases is a core competency, and discuss up front what result you want: the same as everyone else's, or the maximum. In the second case it will cost 2-3 times more, but in the long run it pays off many times over. For those who want to order the service from me, here is all the necessary information and the conditions. I guarantee quality!

Why is it so important to fully develop semantics?

Here, as in any area, the principle of “good and bad choices” works. What is its essence?
Every day we are faced with what we choose:

  • date a person who seems fine but doesn't excite you, or figure yourself out and build a harmonious relationship with the one you really need;
  • keep doing a job you don't like, or find something you love and make it your profession;
  • rent retail space in a low-traffic area, or wait until a suitable location becomes available;
  • hire not the best sales manager, but the one who performed best at today's interview.

Everything seems clear. But look at it from the other side, treating each choice as an investment in the future - this is where it gets interesting!

We saved 3-5 thousand rubles on the semantic core. Great, we're thrilled! But what does this lead to later:

a) for information sites:

  • Traffic losses of at least 1.5x with the same investment in content. Comparing the different methods of collecting key phrases, we have already established empirically that the alternative method collects 51% more;
  • The project drops faster in search results. It’s easy for competitors to get ahead of us by giving a more complete answer in terms of intent.

b) for commercial projects:

  • Fewer leads or higher value. If we have semantics like everyone else, then we are promoting according to the same queries as our competitors. A large number of offers with constant demand reduces the share of each of them in the market;
  • Low conversion. Specific queries convert into sales better. By saving on the semantic core, we lose the best-converting keys;
  • It's harder to advance. There are many people who want to be at the top - the requirements for each of the candidates are higher.

I wish you to always make a good choice and invest only in the positive!

P.S. Bonus “How to write a good article with bad semantics”, as well as other life hacks for promoting and making money on the Internet, read in my group

Good afternoon friends.

Surely you have already forgotten the taste of my articles. The previous material was quite a long time ago, although I promised to publish articles more often than usual.

Recently the amount of work has increased. I created a new project (an information site), worked on its layout and design, collected a semantic core and began publishing material.

Today's material is very voluminous and important for those who have been running their website for more than 6-7 months (in some topics, more than a year), have a large number of articles (around 100 on average), and have not yet reached at least 500-1000 visits per day. These numbers are taken as minimums.

The importance of the semantic core

In some cases, poor website growth is caused by improper technical optimization. In more cases, the content is simply of poor quality. Even more often, texts are not written for any queries at all - nobody needs those materials. And there is also a very large group of people who create a website, optimize everything correctly, and write high-quality texts, yet after 5-6 months the site is only beginning to gain its first 20-30 visitors from search. At that slow pace, a year later there are 100-200 visitors and the income is still zero.

Even though everything was done correctly, with no mistakes, and the texts are sometimes many times better than the competitors', it just doesn't take off, no matter what. We start blaming the problem on a lack of links. Of course, links give a boost to development, but they are not the most important thing. Even without them, a site can reach 1,000 visits a day within 3-4 months.

Many will say this is all idle chatter and that you won't get such numbers so quickly. But if you look closely, it is precisely blogs that fail to reach them. Informational sites (not blogs), created for quick earnings and return on investment, can quite realistically reach a daily traffic of 1,000 people after about 3-4 months, and 5,000-10,000 after a year. The numbers, of course, depend on the competitiveness of the niche, its volume and the volume of the site over that period. But if you take a niche with fairly low competition and a volume of 300-500 articles, such figures are quite achievable within the stated time frame.

Why do blogs fail to achieve such quick results? The main reason is the lack of a semantic core. Because of it, articles are written for just one specific query, and almost always a very competitive one, which prevents the page from reaching the TOP quickly.

On blogs, as a rule, articles are written by imitating competitors. We follow 2 popular blogs, see that they have decent traffic, start analyzing their sitemaps and publish texts for the same queries - queries that have already been rewritten hundreds of times and are highly competitive. As a result, the site gets very high-quality content that nevertheless performs poorly in search, because it needs a lot of site age and trust. And we scratch our heads: why is my content the best, yet it doesn't make it to the TOP?

That is why I decided to write detailed material about the semantic core of a site, so that you can collect a list of queries and - very importantly - write texts for groups of keywords that reach the TOP within just 2-3 months without buying links (provided, of course, the content is high quality).

The material will be difficult for most if you have never encountered this issue in its correct way. But the main thing here is to start. As soon as you start acting, everything immediately becomes clear.

Let me make a very important remark. It concerns those who are not ready to invest in quality with their hard-earned coins and always try to find free loopholes. You can’t compile semantics at high quality for free, and this is a known fact. Therefore, in this article I describe the process of collecting semantics of maximum quality. There will be no free methods or loopholes in this post! There will definitely be a new post where I’ll tell you about free and other tools with which you can collect semantics, but not in full and without the proper quality. Therefore, if you are not ready to invest in the basics of your website, then this material is of no use to you!

Even though almost every blogger writes an article about the semantic core, I can say with confidence that there are no decent free tutorials on this topic on the Internet. And if there are, none of them gives a complete picture of what the end result should be.

Most often the situation ends with some newbie writing material about collecting the semantic core and describing the use of the Yandex search query statistics service (wordstat.yandex.ru). Ultimately: go to this site, enter queries on your topic, the service displays a list of phrases that include the key you entered - and that is the whole technique.

But in fact, this is not how the semantic core is assembled. In the case described above, you simply will not have a semantic core. You will receive some disconnected requests and they will all be about the same thing. For example, let’s take my niche “website building”.

What are the main queries that can be named without hesitation? Here are just a few:

  • How to create a website;
  • Website promotion;
  • Website creation;
  • Website promotion, etc.

The requests are about the same thing. Their meaning comes down to only two concepts: creation and promotion of a website.

After such a check, the wordstat service will display a lot of search queries that are included in the main queries and they will also be about the same thing. Their only difference will be in the changed word forms (adding some words and changing the arrangement of words in the query with changing endings).

Of course, it will be possible to write a certain number of texts, since requests can be different even in this option. For example:

  • How to create a wordpress website;
  • How to create a joomla website;
  • How to create a website on free hosting;
  • How to promote a website for free;
  • How to promote a service website, etc.

Obviously, separate material can be allocated for each request. But such compilation of the semantic core of the site will not be successful, because there will be no complete disclosure of information on the site in the selected niche. All content will be about only 2 topics.

Sometimes newbie bloggers describe the process of compiling a semantic core as analyzing individual queries in the Yandex Wordstat query analysis service. We enter some separate query that does not relate to the topic as a whole, but only to a specific article (for example, how to optimize an article), we get the frequency for it, and here it is - the semantic core is assembled. It turns out that in this way we must mentally identify all possible topics of articles and analyze them.

Both of the options above are wrong, because they do not provide a complete semantic core and force you to constantly return to compiling it (the second option). In addition, you will not have a development vector for the site in your hands and will not know which materials should be published first.

Regarding the first option, when I once bought courses on website promotion, I constantly saw exactly this explanation for collecting the core of queries for a website (enter the main keys and copy all queries from the service into a text document). As a result, I was constantly tormented by the question “What to do with such requests?” The following came to mind:

  • Write many articles about the same thing, but using different keywords;
  • Enter all these keys into the description field for each material;
  • Enter all the keys from the semantic core into the general description field for the entire resource.

None of these assumptions were correct, nor, in general, was the semantic core of the site itself.

In the final version of collecting the semantic core, we should receive not just a list of queries in the amount of 10,000, for example, but have on hand a list of groups of queries, each of which is used for a separate article.

A group can contain from 1 to 20-30 queries (sometimes 50 or even more). We use all of these queries in the text, and in the future the page will bring daily traffic for all of them if it reaches positions 1-3 in search. In addition, each group must have its competition assessed, so we know whether it makes sense to publish a text for it now or not. If competition is high, we can expect results from the page only after 1-1.5 years, and only with regular work on promoting it (links, internal linking, etc.). So it is better to leave such texts for last, even if they promise the most traffic.

Answers to possible questions

Question No. 1. It is clear that the output is a group of queries for writing text, and not just one key. In this case, wouldn’t the keys be similar to each other and why not write a separate text for each request?

Question No. 2. It is known that each page should be tailored to only one keyword, but here we get a whole group and, in some cases, with a fairly large content of queries. How, in this case, does the optimization of the text itself occur? After all, if there are, for example, 20 keys, then the use of each at least once in the text (even of large size) already looks like text for a search engine, and not for people.

Answer. If we take the queries from the previous question as an example, the material is optimized first of all for the most frequent (1st) query, since we are most interested in it reaching the top positions. We consider this keyword the main one in the group.

Optimization for the main key phrase occurs in the same way as it would be done when writing text for only one key (the key in the title heading, using the required number of characters in the text and the required number of times the key itself, if required).

As for the other keys, we also add them, but not blindly - based on an analysis of competitors, which shows the average number of occurrences of these keys in the texts in the TOP. It may turn out that most keywords get zero values, which means they do not need to be used in their exact form.

Thus, the text is written using only the main query in the text directly. Of course, other queries can also be used if the analysis of competitors shows their presence in the texts from the TOP. But this is not 20 keywords in the text in their exact occurrence.

Most recently I published material for a group of 11 keys. It may seem like a lot of queries, but in the screenshot below you can see that only the main, most frequent key has an exact occurrence - 6 times. The remaining key phrases have no exact occurrences, and no diluted ones either (this is not visible in the screenshot, but it comes out during competitor analysis). That is, they are not used at all.

(1st column – frequency, 2nd – competitiveness, 3rd – number of impressions)

In most cases the situation will be similar: only a couple of keys need to be used in exact form, while all the rest will either be heavily diluted or not used at all, even in diluted form. The article stays readable and there is no hint that it was written only for search engines.

Question No. 3. Follows from the answer to Question No. 2. If the remaining keys in the group do not need to be used at all, then how will they receive traffic?

Answer. The fact is that by the presence of certain words in the text, a search engine can determine what the text is about. Since keywords contain certain individual words that relate only to this key, they must be used in the text a certain number of times based on the same competitor analysis.

Thus, the key will not be used as an exact phrase, but the words from the key will be present in the text individually and will also take part in ranking. As a result, the text will also be found for these queries. Ideally, though, the number of occurrences of the individual words should be right - competitor analysis will help here.
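A simple way to sanity-check a draft against this logic is to count both exact phrase occurrences and occurrences of the individual words from each key. The target numbers themselves should come from your competitor analysis; the sketch below only does the counting, and the sample text and keys are illustrative:

```python
import re
from collections import Counter

def occurrence_report(text: str, keys: list) -> dict:
    words = re.findall(r"\w+", text.lower())
    word_counts = Counter(words)
    report = {}
    for key in keys:
        key_words = key.lower().split()
        exact = len(re.findall(re.escape(key.lower()), text.lower()))
        report[key] = {
            "exact": exact,                                   # exact occurrences of the whole phrase
            "words": {w: word_counts[w] for w in key_words},  # individual words from the key
        }
    return report

draft = "A semantic core is a grouped list of queries. Collecting the core starts with Wordstat."
print(occurrence_report(draft, ["semantic core", "collect semantic core"]))
```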

I answered the main questions that could drive you into a stupor. I will write more about how to optimize text for groups of requests in one of the following materials. There will be information about analyzing competitors and about writing the text itself for a group of keys.

Well, if you still have questions, then ask your comments. I will answer everything.

Now let's start compiling the semantic core of the site.

Very important. I won’t be able to describe the whole process as it actually is in this article (I’ll have to do a whole webinar for 2-3 hours or a mini-course), so I’ll try to be brief, but at the same time informative and touch on as many points as possible . For example, I will not describe in detail the configuration of the KeyCollector software. Everything will be cut down, but as clear as possible.

Now let's go through each point. Let's begin. First, preparation.

What is needed to collect a semantic core?


Before creating the site's semantic core, we will configure KeyCollector to correctly collect statistics and parse queries.

Setting up KeyCollector

You can enter the settings by clicking on the icon in the top menu of the software.

First, what concerns parsing.

I would like to highlight the "Number of threads" and "Use primary IP address" settings. Assembling one small core does not require many threads - 2-4 are enough. The more threads, the more proxy servers you need. Ideally, 1 proxy per thread, but 1 proxy per 2 threads also works (that is how I parsed).

As for the second setting: when parsing in 1-2 threads you can use your main IP address, but only if it is dynamic, because if a static IP gets banned you will lose access to the Yandex search engine. Still, priority is always given to using proxy servers - it is better to protect yourself.

On the Yandex.Direct parsing settings tab, it is important to add your Yandex accounts. The more there are, the better. You can register them yourself or buy them, as I wrote earlier. I bought them because it’s easier for me to spend 100 rubles for 30 accounts.

You can add it from the buffer by copying the list of accounts in the required format in advance, or load it from a file.

Accounts must be specified in the “login:password” format, without specifying the host itself in the login (without @yandex.ru). For example, “artem_konovalov:jk8ergvgkhtf”.

Also, if we use several proxy servers, it is better to assign them to specific accounts. It would be suspicious if at first the request comes from one server and from one account, and the next time a request is made from the same Yandex account, the proxy server is different.

Next to the accounts there is a column “IP proxy”. Next to each account we enter a specific proxy. If there are 20 accounts and 2 proxy servers, then there will be 10 accounts with one proxy and 10 with another. If there are 30 accounts, then 15 with one server and 15 with another. I think you understand the logic.

If we use only one proxy, then there is no point in adding it to each account.

I talked a little earlier about the number of threads and the use of the main IP address.

The next tab is “Network”, where you need to enter proxy servers that will be used for parsing.

Proxies are entered at the very bottom of the tab. You can load them from the clipboard in the required format, or do it the simple way, as I did: enter the information about each server (provided to you when you purchase it) into the columns of a row.

Next, we configure the export parameters. Since we need to receive all requests with their frequencies in a file on the computer, we need to set some export parameters so that there is nothing superfluous in the table.

At the very bottom of the tab (highlighted in red), you need to select the data that you want to export to the table:

  • Phrase;
  • Source;
  • Exact frequency ("!");
  • Best form of the phrase (optional).

All that remains is to configure captcha solving. The tab is called "Anticaptcha". Select the service you use and enter the special key from your account in that service.

A special key for working with the service is provided in a letter after registration, but it can also be taken from the account itself in the “Settings - account settings” item.

This completes the KeyCollector settings. After making the changes, do not forget to save the settings by clicking on the large button at the bottom “Save changes”.

When everything is done and we are ready for parsing, we can begin to consider the stages of collecting the semantic core, and then go through each stage in order.

Stages of collecting the semantic core

It is impossible to obtain a high-quality and complete semantic core using only basic queries. You also need to analyze competitors’ requests and materials. Therefore, the entire process of compiling the core consists of several stages, which in turn are further divided into substages.

  1. The basis;
  2. Competitor analysis;
  3. Expansion of the finished list from stages 1-2;
  4. Collection of the best word forms for queries from stages 1-3.

Stage 1 – foundation

When collecting the kernel at this stage you need to:

  • Generate the main list of queries in the niche;
  • Expand these queries;
  • Clean the results.

Stage 2 – competitors

In principle, stage 1 already provides a certain volume of the core, but not the full one, because we may be missing something. Competitors in our niche will help us fill in the gaps. Here are the steps to follow:

  • Collecting competitors based on requests from stage 1;
  • Parsing competitors' queries (site map analysis, open liveinternet statistics, analysis of domains and competitors in SpyWords);
  • Cleaning.

Stage 3 – expansion

Many people stop at the first stage. Some get to the 2nd, but there are a number of additional queries that can also supplement the semantic core.

  • We combine requests from stages 1-2;
  • We leave 10% of the most frequent words from the entire list, which contain at least 2 words. It is important that these 10% are no more than 100 phrases, because a large number will force you to dig deep into the process of collecting, cleaning and grouping. We need to assemble the kernel in a speed/quality ratio (minimal quality loss at maximum speed);
  • We expand these queries using the Rookee service (everything is in KeyCollector);
  • Cleaning.

Stage 4 – collecting the best word forms

The Rookee service can determine the best (correct) word form for most queries. This should also be used. The goal is not to determine the word that is more correct, but to find some more queries and their forms. In this way, you can pull up another pool of queries and use them when writing texts.

  • Combining requests from the first 3 stages;
  • Collection of the best word forms based on them;
  • Adding the best word forms to the list for all queries combined from stages 1-3;
  • Cleaning;
  • Export the finished list to a file.

As you can see, everything is not so fast, and especially not so simple. I outlined only a normal plan for compiling a kernel in order to get a high-quality list of keys at the output and not lose anything or lose as little as possible.

Now I propose to go through each point separately and study everything from A to Z. There is a lot of information, but it’s worth it if you need a really high-quality semantic core of the site.

Stage 1 – foundation

First, we create a list of basic niche queries. Typically, these are 1-3 word phrases that describe a specific niche issue. As an example, I propose to take the “Medicine” niche, and more specifically, the sub-niche of heart diseases.

What main requests can we identify? Of course, I won’t write everything, but I will give a couple.

  • Heart diseases
  • Heart attack;
  • Cardiac ischemia;
  • Arrhythmia;
  • Hypertension;
  • Heart disease;
  • Angina, etc.

In simple words, these are the common names of diseases. There can be quite a lot of such queries. The more you can come up with, the better. But you should not add them just for the sake of quantity. There is no point in writing out more specific phrases derived from the general ones, for example:

  • Arrhythmia;
  • Arrhythmia causes;
  • Treatment of arrhythmia;
  • Arrhythmia symptoms.

Only the first phrase matters here. There is no point in listing the rest, because they will appear in the list during expansion, when parsing the left column of Yandex Wordstat.

To search for common phrases, you can use both competitor sites (site map, section names...) and the experience of a specialist in this niche.


Parsing will take some time, depending on the number of requests in the niche. All requests are by default placed in a new group called "New Group 1", if memory serves. I usually rename groups to understand which one is responsible for what. The group management menu is located to the right of the request list.

The rename function is in the context menu when you right-click. This menu will also be needed to create other groups in the second, third and fourth stages.

Therefore, you can immediately add 3 more groups by clicking on the first “+” icon so that the group is created in the list immediately after the previous one. There is no need to add anything to them yet. Let them just be.

I named the groups like this:

  • Competitors - it is clear that this group contains a list of requests that I collected from competitors;
  • 1-2 is a combined list of queries from the 1st (main list of queries) and 2nd (competitors’ queries) stages, in order to leave only 10% of the queries consisting of at least 2 words and collect extensions from them;
  • 1-3 – combined list of requests from the first, second and third (extensions) stages. We also collect the best word forms in this group, although it would be smarter to collect them in a new group (for example, the best word forms), and then, after cleaning them, move them to group 1-3.

After the parsing from Yandex Wordstat is finished, you get a large list of key phrases which, as a rule (if the niche is small), will run to a few thousand. Much of it is garbage and dummy queries that will have to be cleaned out. Some of it will be sifted out automatically by KeyCollector’s functionality, and some will have to be shoveled through by hand, which takes a while.

When all the requests are collected, you need to collect their exact frequencies. The overall frequency is collected during parsing, but the exact frequency must be collected separately.

To collect statistics on the number of impressions, you can use two functions:

  1. Using Yandex Direct - fast, statistics are collected in batches, but there are limitations (for example, phrases longer than 7 words will not work, nor will phrases with special symbols);
  2. Using analysis in Yandex Wordstat - very slowly, phrases are analyzed one by one, but there are no restrictions.

First, collect statistics using Direct, so that this is as fast as possible, and for those phrases for which it was not possible to determine statistics using Direct, we use Wordstat. As a rule, there will be few such phrases left and they will be collected quickly.

Impression statistics are collected using Yandex.Direct by clicking on the appropriate button and assigning the necessary parameters.

After clicking on the “Get data” button, there may be a warning that Yandex Direct is not enabled in the settings. You will need to agree to activate parsing using Direct in order to begin determining frequency statistics.

You will immediately see exact impression figures being recorded in batches in the “Frequency!” column for each phrase.

The process of completing a task can be seen on the “Statistics” tab at the very bottom of the program. When the task is completed, you will see a completion notification on the Event Log tab, and the progress bar on the Statistics tab will disappear.

After collecting the number of impressions using Yandex Direct, we check whether there are phrases for which the frequency was not collected. To do this, sort the “Frequency!” column (click on it) so that the smallest or largest values appear at the top.

If all the zeros are at the top, then all the frequencies have been collected. If there are empty cells, impressions for those phrases have not been determined. The same applies when sorting in descending order, only then you will have to look at the results at the very bottom of the query list.

You can also start collecting using Yandex Wordstat by clicking on the icon and selecting the required frequency parameter.


After selecting the frequency type, you will see how the empty cells gradually begin to fill.

Important: do not be alarmed if empty cells remain after the end of the procedure. They will be empty if the exact frequency is less than 30; we set this in the parsing settings in KeyCollector. These phrases can be safely selected (just like regular files in Windows Explorer) and deleted. The phrases will be highlighted in blue; right-click and select “Delete selected lines.”

When all the statistics have been collected, you can start cleaning, which is very important to do at each stage of collecting the semantic core.

The main task is to remove phrases that are not related to the topic, remove phrases with stop words and get rid of queries that are too low in frequency.

Ideally, the latter should not be done at all, but if, when compiling the semantic core, we use phrases with a minimum frequency of even 5 impressions per month, this will increase the core by 50-60 percent (maybe even 80%) and force us to dig very deep. We need maximum speed with minimal losses.

If we want to get the most complete semantic core of the site and are prepared to spend about a month collecting it (having no experience at all), then take frequencies from 4-5 monthly impressions. But it is better (if you are a beginner) to leave only queries that have at least 30 impressions per month. Yes, we lose a little, but this is the price of maximum speed. And as the project grows, these queries can be collected again and used for new materials. And that is only once the whole semantic core has already been written out and there are no topics left for articles.

KeyCollector itself can filter queries by the number of impressions and other parameters fully automatically. I recommend starting with exactly this, rather than deleting garbage phrases and phrases with stop words, because it will be much easier to do that once the total core volume at this stage is minimal. What is easier: shoveling through 10,000 phrases or 2,000?

Filters are accessed from the Data tab by clicking on the Edit Filters button.

I recommend first displaying all queries with a frequency of less than 30 and moving them to a new group so as not to delete them, as they may be useful in the future. If we simply apply a filter to display phrases with a frequency of more than 30, then after the next launch of KeyCollector we will have to reapply the same filter, since everything is reset. Of course, you can save the filter, but you will still have to apply it, constantly returning to the “Data” tab.

To save ourselves from these actions, we add a condition in the filter editor so that only phrases with a frequency of less than 30 are displayed.
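
For readers who keep their keyword lists outside KeyCollector, here is a rough Python sketch of the same threshold logic: queries below 30 exact impressions are set aside rather than deleted. The file name and column name are assumptions for illustration, not a real KeyCollector export format.

```python
import csv

THRESHOLD = 30                     # minimum exact frequency from the text
keep, low_frequency = [], []

with open("keywords.csv", newline="", encoding="utf-8") as f:   # hypothetical export
    for row in csv.DictReader(f):
        freq = int(row["frequency_exact"] or 0)                 # assumed column name
        (keep if freq >= THRESHOLD else low_frequency).append(row)

print(f"kept: {len(keep)}, set aside for later: {len(low_frequency)}")
```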




In the future, you can select a filter by clicking on the arrow next to the floppy disk icon.

So, after applying the filter, only phrases with a frequency of less than 30, i.e. 29 and below, will remain in the query list. The filtered column will also be highlighted in color. In the example below you will only see a frequency of 30, because I am showing all this on a core that is already finished and fully cleaned. Don’t pay attention to that; everything should be as I describe in the text.

To transfer, you need to select all phrases in the list. Click on the first phrase, scroll to the very bottom of the list, hold down the “Shift” key and click once on the last phrase. This way, all phrases are highlighted and marked with a blue background.

A small window will appear where you need to select where to move the phrases.

Now you can remove the filter from the frequency column so that only queries with a frequency of 30 and above remain.


We have completed a certain stage of automatic cleaning. Next you will have to tinker with deleting garbage phrases.

First, I suggest specifying stop words to remove all phrases containing them. Ideally, of course, this is done immediately at the parsing stage so that they do not end up on the list, but this is not critical, since clearing stop words occurs automatically using the KeyCollector.

The main difficulty is compiling the list of stop words, because they are different for each topic. So whether we clean out stop words at the beginning or now is not so important, since all the stop words have to be found anyway, and that is a laborious and not particularly quick task.

On the Internet you can find general thematic lists, which include the most common words like “abstract, free, download, p...rn, online, picture, etc.”

First, I suggest using a general topic list to further reduce the number of phrases. On the “Data Collection” tab, click on the “Stop words” button and add them in a list.

In the same window, click on the “Mark phrases in the table” button to mark all phrases containing the entered stop words. But it is necessary that the entire list of phrases in the group is unchecked, so that after clicking the button, only phrases with stop words remain marked. It's very easy to unmark all phrases.
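
Under the hood this marking step is just word matching against a list. Below is a rough Python sketch of that logic, with a deliberately tiny stop list and invented phrases; a real list would be far longer and topic-specific.

```python
# Generic stop words as mentioned in the text; the phrase list is invented.
stop_words = {"abstract", "free", "download", "online", "picture"}

def contains_stop_word(phrase: str) -> bool:
    # A phrase is marked if any of its words is in the stop list.
    return any(word in stop_words for word in phrase.lower().split())

phrases = [
    "arrhythmia symptoms",
    "arrhythmia treatment download free",
    "heart disease picture",
]

marked = [p for p in phrases if contains_stop_word(p)]
clean = [p for p in phrases if not contains_stop_word(p)]
print("marked:", marked)   # candidates for the "With stop words" group
print("clean:", clean)
```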


When only the phrases with stop words remain marked, we either delete them or move them to a new group. The first time I simply deleted them, but it is still better to create a group “With stop words” and move all the unnecessary phrases there.

After this cleaning there are even fewer phrases. But that is not all, because we still missed some things: both stop words themselves and phrases unrelated to the focus of our site. These may be commercial queries for which texts cannot be written, or can be written but will not meet the user’s expectations.

Examples of such queries may be related to the word “buy”. Surely, when a user searches for something with this word, he already wants to get to the site where they sell it. We will write text for such a phrase, but the visitor will not need it. Therefore, we do not need such requests. We look for them manually.

We slowly and carefully scroll through the remaining list of queries to the very end, looking for such phrases and discovering new stop words. If you find a word that is used many times, then simply add it to the existing list of stop words and click on the “Mark phrases in the table” button. At the end of the list, when we have marked all unnecessary queries during manual checking, we delete the marked phrases and the first stage of compiling the semantic core is completed.

We have obtained a certain semantic core. It is not quite complete yet, but it will already allow you to write the maximum possible part of the texts.

All that remains is to add to it a small part of the requests that we might have missed. The following steps will help with this.

Stage 2 – competitors

At the very beginning, we compiled a list of common phrases related to the niche. In our case these were:

  • Heart diseases
  • Heart attack;
  • Cardiac ischemia;
  • Arrhythmia;
  • Hypertension;
  • Heart disease;
  • Angina, etc.

All of them belong specifically to the “Heart Disease” niche. Using these phrases, you need to search to find competitor sites on this topic.

We enter each of the phrases and look for competitors. It is important that these are not general thematic ones (in our case, medical sites with a general focus, i.e. about all diseases). What is needed is niche projects. In our case - only about the heart. Well, maybe also about the vessels, because... the heart is connected to the vascular system. I think you get the idea.

If our niche is “Recipes for salads with meat,” then in the future these are the only sites we should look for. If they are not there, then try to find sites only about recipes, and not in general about cooking, where everything is about everything.

If the site is a general thematic one (general medical, women’s, about all types of construction and repair, cooking, sports), then you will have to suffer a lot when compiling the semantic core itself: you will have to work long and tediously, collecting the main list of queries, waiting a long time for parsing, cleaning and grouping.

If on the 1st, and sometimes even on the 2nd page, you cannot find narrow thematic sites of competitors, then try using not the main queries that we generated before the parsing itself at the 1st stage, but queries from the entire list after parsing. For example:

  • How to treat arrhythmia with folk remedies;
  • Symptoms of arrhythmia in women and so on.

The fact is that such queries (arrhythmia, heart disease, and so on) are highly competitive and it is almost impossible to get into the TOP for them. Therefore, in the first positions, and perhaps even on the first pages, you will realistically find only general thematic portals about everything, thanks to their enormous authority in the eyes of search engines, their age and their link mass.

So it makes sense to use lower-frequency phrases consisting of more words to find competitors.

We need to parse their requests. You can use the SpyWords service, but its query analysis function is available on a paid plan, which is quite expensive. Therefore, for one core there is no point in upgrading the tariff on this service. If you need to collect several cores over the course of a month, for example 5-10, then you can buy an account. But again - only if you have a budget for the PRO tariff.

You can also use Liveinternet statistics if they are open for viewing. Very often, owners make it open to advertisers, but close the “search phrases” section, which is exactly what we need. But there are still sites where this section is open to everyone. Very rare, but available.

The easiest way is to simply view the sections and site map. Sometimes we may miss not only some well-known niche phrases, but also specific requests. There may not be so much material on them and you can’t create a separate section for them, but they can add a couple of dozen articles.

When we have found another list of new phrases to collect, we launch the same collection of search phrases from the left column of Yandex Wordstat, as in the first stage. We just launch it already, being in the second group “Competitors”, so that requests are added specifically to it.

  • After parsing, we collect the exact frequencies of search phrases;
  • We set a filter and move (delete) queries with a frequency of less than 30 to a separate group;
  • We clean out garbage (stop words and queries that are not related to the niche).

So, we received another small list of queries and the semantic core became more complete.

Stage 3 – expansion

We already have a group called “1-2”. We copy phrases from the “Main list of requests” and “Competitors” groups into it. It is important to copy and not move, so that all phrases remain in the previous groups, just in case. It will be safer this way. To do this, in the phrase transfer window you need to select the “copy” option.

We have received all the queries from stages 1-2 in one group. Now we need to leave in this group only the 10% most frequent queries of the total number that contain at least 2 words. Also, there should be no more than 100 of them. We cut the list down so as not to get bogged down collecting the core for a month.

First, we apply a filter in which we set the condition so that at least 2-word phrases are shown.


We mark all the phrases in the remaining list. By clicking on the “Frequency!” column, we sort the phrases in descending order of impressions so that the most frequent ones are at the top. Next, select the first 10% of the remaining queries, uncheck them (right mouse button - uncheck selected lines) and delete the marked phrases so that only this 10% remains. Don’t forget that if your 10% is more than 100 phrases, stop at line 100; no more is needed.
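
Here is a small Python sketch of the “10% of the most frequent 2+ word phrases, no more than 100” rule described above; the phrase list and frequencies are invented.

```python
# Invented (phrase, exact frequency) pairs; a real list would hold thousands.
phrases = [
    ("arrhythmia treatment", 1200),
    ("heart disease symptoms", 900),
    ("arrhythmia", 3000),              # single word, excluded by the rule
    ("cardiac ischemia causes", 400),
    ("hypertension diet", 150),
]

multi_word = [p for p in phrases if len(p[0].split()) >= 2]
multi_word.sort(key=lambda p: p[1], reverse=True)        # most frequent first

top_count = min(max(1, len(multi_word) // 10), 100)      # 10%, capped at 100
seed_for_expansion = multi_word[:top_count]
print(seed_for_expansion)
```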

Now we carry out the expansion using a KeyCollector function. The Rookee service will help with this.


We indicate the collection parameters as in the screenshot below.

The service will collect all the extensions. Among the new phrases there may be very long keys, as well as ones with symbols, so it will not be possible to collect frequency through Yandex Direct for all of them. For those, you will have to collect statistics using the Wordstat button.

After receiving statistics, we remove requests with monthly impressions of less than 30 and carry out cleaning (stop words, garbage, keywords that are not suitable for the niche).

The stage is over. We received another list of requests.

Stage 4 – collecting the best word forms

As I said earlier, the goal is not to determine the form of the phrase that will be more correct.

In most cases (based on my experience), collecting the best word forms will point to the same phrase that is in the search list. But, without a doubt, there will also be queries for which a new word form will be indicated, which is not yet in the semantic core. These are additional key queries. At this stage we achieve the core to maximum completeness.

When I assembled my own core, this stage yielded another 107 additional queries.

First, we copy the keys into the “1-3” group from the “Main Queries”, “Competitors” and “1-2” groups. The sum of all requests from all previously completed stages should be obtained. Next, we use the Rookee service using the same button as the extension. Just choose another function.


The collection will begin. Phrases will be added to a new column “Best form of phrase”.

The best form will not be determined for all phrases, since the Rookee service simply does not know all the best forms. But for the majority the result will be positive.

When the process is completed, you need to add these phrases to the entire list so that they are in the same “Phrase” column. To do this, select all the phrases in the “Best phrase form” column, copy (right mouse button - copy), then click on the big green “Add phrases” button and enter them.

It is very easy to make sure that phrases appear in the general list. Since adding phrases to the table like this happens at the very bottom, we scroll to the very bottom of the list and in the “Source” column we should see the add button icon.

Phrases added using extensions will be marked with a hand icon.

Since the frequency for the best word forms has not been determined, this needs to be done. Similar to the previous stages, we collect the number of impressions. Don’t be afraid that we collect in the same group where the other requests are located. The collection will simply continue for those phrases that have empty cells.

If it is more convenient for you, then initially you can add the best word forms not to the same group where they were found, but to a new one, so that only they are there. And already in it, collect statistics, clear garbage, and so on. And only then add the remaining normal phrases to the entire list.

That's all. The semantic core of the site has been assembled. But there is still a lot of work left. Let's continue.

Before the next steps, you need to download all requests with the necessary data into an Excel file on your computer. We set the export settings earlier, so you can do it right away. The export icon in the KeyCollector main menu is responsible for this.

When you open the file, you should get 4 columns:

  1. Phrase;
  2. Source;
  3. Frequency!;
  4. The best form of the phrase.

This is our final semantic core, containing the maximum pure and necessary list of queries for writing future texts. In my case (narrow niche) there were 1848 requests, which equals approximately 250-300 materials. I can’t say for sure - I haven’t completely ungrouped all the requests yet.

For immediate use this is still a raw version, because the queries are in chaotic order. We still need to distribute them into groups so that each one contains the keys for one article. This is the ultimate goal.

Ungrouping the semantic core

This stage is completed quite quickly, although with some difficulties. The service http://kg.ppc-panel.ru/ will help us. There are other options, but we will use this one because it gives the best quality/speed ratio, and what is needed here is not speed but, first and foremost, quality.

A very useful thing about the service is that it remembers all actions in your browser using cookies. Even if you close this page or the browser as a whole, everything will be saved. This way there is no need to do everything at once and be afraid that everything may be lost in one moment. You can continue at any time. The main thing is not to clear your browser cookies.

I will show you how to use the service using the example of several fictitious queries.

We go to the service and add the entire semantic core (all the queries from the Excel file) exported earlier. Just copy all the keys and paste them into the window, as shown in the image below.

They should appear in the left column “Keywords”.

Ignore the presence of shaded groups on the right side. These are groups from my previous core.

We look at the left column. There is an added list of queries and a “Search/Filter” line. We will use the filter.

The technology is very simple. When we enter part of a word or phrase, the service in real time leaves in the list of queries only those that contain the entered word/phrase in the query itself.

See below for more clarity.


I wanted to find all queries related to arrhythmia. I enter the word “Arrhythmia” and the service automatically leaves in the list of queries only those that contain the entered word or part of it.

The filtered phrases can then be moved to a group, which will be named after one of the key phrases it contains.

We received a group containing all the keywords for arrhythmia. To see the contents of a group, click on it. A little later we will further divide this group into smaller groups, since there are a lot of keys with arrhythmia and they are all under different articles.

Thus, at the initial stage of grouping, you need to create large groups that combine a large number of keywords from one niche question.
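
Conceptually, this first grouping pass is just a substring filter, the same idea as the service’s “Search/Filter” box. Here is a minimal Python sketch of it, with invented group names and queries:

```python
# Group names mirror the main niche phrases; the query list is invented.
groups = {"arrhythmia": [], "heart attack": [], "hypertension": []}
ungrouped = [
    "arrhythmia symptoms in women",
    "how to treat arrhythmia with folk remedies",
    "hypertension diet",
    "first aid for heart attack",
    "cardiac ischemia causes",
]

leftover = []
for query in ungrouped:
    for name in groups:
        if name in query:              # same idea as the "Search/Filter" box
            groups[name].append(query)
            break
    else:
        leftover.append(query)         # stays in the ungrouped list

print(groups)
print("still ungrouped:", leftover)
```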

If we take the same topic of heart disease as an example, then first I will create a group “arrhythmia”, then “heart disease”, then “heart attack” and so on until there are groups for each disease.

As a rule, there will be almost as many such groups as the main niche phrases generated at the 1st stage of collecting the core. But in any case, there should be more of them, since there are also phrases from stages 2-4.

Some groups may contain 1-2 keys altogether. This may be due, for example, to a very rare disease and no one knows about it or no one is looking for it. That's why there are no requests.

In general, when the main groups are created, it is necessary to break them down into smaller groups, which will be used to write individual articles.

Inside the group, next to each key phrase there is a cross; by clicking on it, the phrase is removed from the group and goes back to the ungrouped list of keywords.

This is how further grouping occurs. I'll show you with an example.

In the image you can see that there are key phrases related to arrhythmia treatment. If we want to define them in a separate group for a separate article, then we remove them from the group.


They will appear in the list in the left column.

If there are still phrases in the left column, then to find the keys removed from the group you will have to apply the filter (use the search). If the list has been completely divided into groups, then only the removed queries will be there. We mark them and click on “Create group”.


Another one will appear in the “Groups” column.

Thus, we distribute all the keys by topic and, ultimately, a separate article is written for each group.

The only difficulty in this process lies in deciding which keywords actually need to be split into separate groups. The fact is that some keys are different in essence but do not require separate texts: one detailed piece of material can cover many questions at once.

This is especially pronounced in medical topics. If we take the arrhythmia example, there is no point in separating the keys “arrhythmia causes” and “arrhythmia symptoms” into their own groups, while keys about treating arrhythmia are still an open question.

This will be found out after analyzing the search results. We go to Yandex search and enter the analyzed key. If we see that the TOP contains articles devoted only to symptoms of arrhythmia, then we separate this key into a separate group. But, if the texts in the TOP cover all the issues (treatment, causes, symptoms, diagnosis, and so on), then ungrouping in this case is not necessary. We cover all these topics in one article.

If in Yandex exactly such texts are at the top of the search results, then this is a sign that ungrouping is not worth doing.

The same can be exemplified by the key phrase “causes of hair loss.” There may be “causes of hair loss in men” and “...in women.” Obviously, you can write a separate text for each key, based on logic. But what will Yandex say?

We enter each key and see what texts come up. If there are separate detailed texts for each key, then the keys are split into separate groups. If the TOP for both queries contains general materials for the key “causes of hair loss”, within which the questions about women and men are covered, then we leave the keys within one group and publish one piece of material that covers the topics of all the keys.

This is important, since it is not for nothing that the search engine identifies texts in the TOP. If the first page contains exclusively detailed texts on a specific issue, then there is a high probability that by breaking a large topic into subtopics and writing material on each, you will not get to the TOP. And so one material has every chance of getting good positions for all requests and collecting good traffic for them.

In medical topics, a lot of attention needs to be paid to this point, which significantly complicates and slows down the ungrouping process.

At the very bottom of the “Groups” column there is an “Unload” button. Click it and we get a new column with a text field containing all the groups, separated by blank lines.

If not all keys have been placed into groups, there will be no blank lines between them in the “Unload” field. They appear only when the grouping is fully completed.

Select all the words (Ctrl+A), copy them and paste them into a new Excel file.

Under no circumstances click on the “Clear all” button, as absolutely everything you have done will be deleted.

The ungrouping phase is over. Now you can safely write the text for each group.

But for maximum efficiency, if your budget does not allow you to have the entire semantic core written out in a couple of days and you only have a strictly limited opportunity to regularly publish a small number of texts (10-30 per month, for example), then it is worth determining the competition of all the groups. This is important because groups with the least competition produce results within the first 2-3-4 months after publication, without any links. All you need to do is write high-quality, competitive text and optimize it correctly. Then time will do everything for you.

Definition of group competition

I would like to note right away that a low-competition request or group of requests does not mean that they are very micro low-frequency. The beauty is that there is a fairly decent number of requests that have low competition, but at the same time have a high frequency, which immediately gives such an article a place in the TOP and attracts solid traffic to one document.

For example, a very realistic picture is when a group of queries has a competition score of 2-5 and a frequency of about 300 impressions per month. Having written only 10 such texts, after they reach the TOP we will get at least 100 visitors daily. And that is only 10 texts. 50 articles mean 500 visits, and these figures are a minimum, since they only account for traffic from the exact queries in the group. Traffic will also come from other query tails, not just from the phrases in the groups.
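
Just to restate the arithmetic from that example in runnable form (the numbers are the same illustrative ones from the paragraph, not measurements):

```python
texts = 10
avg_frequency_per_text = 300              # exact impressions per month per group
monthly_impressions = texts * avg_frequency_per_text

print(monthly_impressions / 30)           # ~100 visitors per day from 10 texts
print(50 * avg_frequency_per_text / 30)   # 50 articles -> ~500 visits per day
```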

This is why identifying the competition is so important. Sometimes you can see a situation where there are 20-30 texts on a site, and there are already 1000 visits. And the site is young and there are no links. Now you know what this is connected with.

Query competition can be determined through the same KeyCollector absolutely free of charge. This is very convenient, but this option is not ideal, since the real formula for determining competition changes constantly along with search engine algorithms.

It is better to determine competition using the service http://mutagen.ru/. It is paid, but its results are as close to the real picture as possible.

100 queries cost only 30 rubles. If you have a core of 2000 queries, the entire check will cost 600 rubles. 20 free checks are given per day (only to those who top up their balance by any amount). You could evaluate 20 phrases every day until you determine the competitiveness of the entire core, but that would take very long and make little sense.

Therefore, I use Mutagen and have no complaints about it. Sometimes there are problems with processing speed, but this is not critical, since even after closing the service page the check continues in the background.

The analysis itself is very simple. Register on the site, top up the balance in any convenient way, and the check becomes available at the main address (mutagen.ru). Enter a phrase into the field and it is immediately evaluated.

We see that for the query being checked, the competitiveness turned out to be more than 25. This is a very high indicator and can be equal to any number. The service does not display it as real, since this does not make sense due to the fact that such competitive requests are almost impossible to promote.

A normal level of competition is considered to be up to 5. Such queries are easy to promote without extra effort. Slightly higher values are also quite acceptable, but queries with values above 5 (for example, 6-10) should be used only after you have already written texts for the minimally competitive ones. What matters is getting a text into the TOP as quickly as possible.

Also during the assessment, the cost of a click in Yandex.Direct is determined. It can be used to estimate your future earnings. We take into account guaranteed impressions, the value of which we can safely divide by 3. In our case, we can say that one click on a Yandex Direct advertisement will bring us 1 ruble.

The service also determines the number of impressions, but we do not look at them, since the frequency of the “!request” type is not determined, but only the “request”. The indicator turns out to be inaccurate.

This analysis option is suitable if we want to analyze a single request. If you need a mass check of key phrases, then there is a special link on the main page at the top of the key entry field.


On the next page we create a new task and add a list of keys from the semantic core. We take them from the file where the keys are already ungrouped.

If there are enough funds on the balance, the check will begin immediately. Its duration depends on the number of keys and may take some time. I analyzed 2000 phrases in about 1 hour, while another time checking only 30 keys took several hours.

Immediately after starting the scan, you will see the task in the list, where there will be a “Status” column. This will help you understand whether the task is ready or not.

In addition, after the task is completed, you can immediately download a file with the list of all phrases and the competition level of each. The phrases will be in the same order in which they were grouped. The only thing is that the blank lines between the groups will be removed, but this is not a big problem, since everything remains intuitive: each group contains keys on its own topic.

In addition, if the task has not yet completed, you can go inside the task itself and see the competition results for already completed requests. Just click on the task name.

As an example, I will show the result of checking a group of queries for the semantic core. Actually, this is the group for which this article was written.

We see that almost all requests have a maximum competition of 25 or closer to it. This means that for these requests, I either won’t see the first positions at all or won’t see them for a very long time. I wouldn’t write such material on the new site at all.

Now I published it only to create quality content for the blog. Of course, my goal is to get to the top, but that’s later. If the manual reaches the first page, at least only for the main request, then I can already count on significant traffic only to this page.

The last step is to create the final file, which we will look at in the process of filling the site. The semantic core of the site has already been assembled, but managing it is not entirely convenient yet.

Creating a final file with all data

We will need the KeyCollector again, as well as the latest excel file that we received from the mutagen service.

We open the previously received file and see something like the following.

We only need 2 columns from the file:

  1. key;
  2. competition.

You can simply delete all other contents from this file so that only the necessary data remains, or you can create a completely new file, make a beautiful header line there with the names of the columns, highlighting it, for example, in green, and copy the corresponding data into each column.

Next, we copy the entire semantic core from this document and add it to KeyCollector again. The frequencies will have to be collected again. This is needed so that the frequencies are collected in the same order as the key phrases. Previously we collected them to weed out garbage, and now to create the final file. Of course, we add the phrases to a new group.

When the frequencies are collected, we export the file and copy the entire frequency column into the final file, which will have 3 columns:

  1. key phrase;
  2. competitiveness;
  3. frequency.

Now all that’s left to do is sit a little and do the math for each group:

  • The average frequency of the group - what matters is not the indicator of each key, but the average indicator of the group. It is calculated as the usual arithmetic mean (in Excel, the "AVERAGE" function);
  • The total frequency of the group divided by 3, to bring the possible traffic down to realistic numbers (a short sketch of both calculations is shown after this list).
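
If you prefer to check the numbers outside Excel, here is a minimal Python sketch of the same two calculations for one invented group:

```python
# Invented group of phrases with exact monthly frequencies.
group = {
    "arrhythmia treatment": 320,
    "how to treat arrhythmia": 210,
    "arrhythmia treatment with folk remedies": 115,
}

frequencies = list(group.values())
average_frequency = sum(frequencies) / len(frequencies)   # Excel's AVERAGE
estimated_traffic = sum(frequencies) / 3                  # realistic visits per month

print(round(average_frequency), round(estimated_traffic))
```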

So that you don’t have to worry about mastering calculations in Excel, I have prepared a file for you where you just need to enter data in the required columns and everything will be calculated fully automatically. Each group must be calculated separately.

There will be a simple calculation example inside.

As you can see, everything is very simple. From the final file, simply copy all the phrases of a group with all their indicators and paste them into this file in place of the previous phrases with their data. The calculation will happen automatically.

It is important that the columns in the final file and in mine are in exactly this order. First the phrase, then the frequency, and only then the competition. Otherwise, you will count nonsense.

Next, you will have to sit a little longer and create another file, or better a separate sheet (which I recommend) inside the same file, with data about each group. This sheet will not contain all the phrases, only the main phrase of each group (we will use it to identify what the group is about) and the group’s calculated values taken from my file. This is a kind of summary with data and calculated values for each group, and it is what I look at when choosing texts for publication.

You will get the same 3 columns, only without any calculations.

In the first one we insert the main phrase of the group. In principle, any one is possible. It is only needed to copy it and, through a search, find the location of all the group keys that are on another sheet.

In the second we copy the calculated frequency value from my file, and in the third we copy the average value of the group’s competition. We take these numbers from the “Result” line.

The result will be the following.

This is the 1st sheet. The second sheet contains all the contents of the final file, i.e. all phrases with indicators of their frequency and competition.

Now we look at the 1st sheet. We select the most “cost-effective” group of keys. We copy the phrase from the 1st column and find it using search (Ctrl + F) in the second sheet, where the rest of the group phrases will be located next to it.

That's all. The manual has come to an end. At first glance, everything is very complicated, but in reality it is quite simple. As I already said at the very beginning of the article, you just have to start doing it. In the future I plan to make a video manual using these instructions.

Well, on this note I end the instructions.

That’s all, friends. If you have any suggestions or questions, I’m waiting for you in the comments. See you later.

P.S. Record for volume of content. In Word it turned out to be more than 50 sheets of small print. It was possible not to publish an article, but to create a book.

Best regards, Konstantin Khmelev!

At the moment, factors such as content and structure play the most important role for search engine promotion. However, how to understand what to write text about, what sections and pages to create on the site? In addition to this, you need to find out exactly what the target visitor to your resource is interested in. To answer all these questions you need to collect a semantic core.

Semantic core - a list of words or phrases that fully reflects the theme of your site.

In the article I will tell you how to pick it up, clean it and break it down into structure. The result will be a complete structure with queries clustered across pages.

Here is an example of a query core broken down into a structure:


By clustering I mean breaking your search queries down into separate pages. This method will be relevant for both promotion in Yandex and Google PS. In this article I will describe a completely free way to create a semantic core, but I will also show options with various paid services.

After reading the article, you will learn

  • Choose the right queries for your topic
  • Collect the most complete core of phrases
  • Clean up uninteresting requests
  • Group and create structure

Having collected the semantic core you can

  • Create a meaningful structure on the site
  • Create a multi-level menu
  • Fill pages with texts and write meta descriptions and titles on them
  • Collect your website's positions for queries from search engines

Collection and clustering of the semantic core

Correct compilation for Google and Yandex begins with identifying the main key phrases of your topic. As an example, I will demonstrate its composition using a fictitious online clothing store. There are three ways to collect the semantic core:

  1. Manual. Using the Yandex Wordstat service, you enter your keywords and manually select the phrases you need. This method is quite fast if you need to collect keys on one page, however, there are two disadvantages.
    • The accuracy of the method is poor. You may always miss some important words if you use this method.
    • You will not be able to assemble a semantic core for a large online store, although you can use the Yandex Wordstat Assistant plugin to simplify it - this will not solve the problem.
  2. Semi-automatic. Here I mean using a program to collect the core and then manually breaking it down into sections, subsections, pages, etc. In my opinion, this method of compiling and clustering the semantic core is the most effective, because it has a number of advantages:
    • Maximum coverage of all topics.
    • Qualitative breakdown
  3. Auto. Nowadays there are several services that offer fully automatic core collection or clustering of your queries. I do not recommend the fully automatic option, because the quality of collection and clustering of the semantic core is currently quite low. Automatic query clustering is gaining popularity and has its place, but you will still need to merge some pages manually, because the system does not provide an ideal ready-made solution. And in my opinion, you will simply get confused and will not be able to immerse yourself in the project.

To compile and cluster a full-fledged correct semantic core for any project, in 90% of cases I use a semi-automatic method.

So, in order we need to follow these steps:

  1. Selection of queries for topics
  2. Collecting the kernel based on requests
  3. Cleaning up non-target requests
  4. Clustering (breaking phrases into structure)

I showed an example of selecting a semantic core and grouping into a structure above. Let me remind you that we have an online clothing store, so let’s start looking at point 1.

1. Selection of phrases for your topic

At this stage we will need the Yandex Wordstat tool, your competitors and logic. In this step, it is important to collect a list of phrases that are thematic high-frequency queries.

How to select queries to collect semantics from Yandex Wordstat

Go to the service, select the city(s)/region(s) you need, enter what you think are the “fattest” (highest-volume) queries and look at the right column. There you will find the thematic words you need, both for other sections and high-frequency synonyms of the entered phrase.

How to select queries before compiling a semantic core using competitors

Enter the most popular queries into the search engine and select one of the most popular sites, many of which you most likely already know.

Pay attention to the main sections and save the phrases you need.

At this stage it is important to do one thing right: cover as fully as possible all the words from your topic without missing anything; then your semantic core will be as complete as possible.

Applying to our example, we need to create a list of the following phrases/keywords:

  • Clothing
  • Shoes
  • Boots
  • Dresses
  • T-shirts
  • Underwear
  • Shorts

What phrases are pointless to enter: women’s clothing, buy shoes, prom dress, etc. Why? These phrases are the “tails” of the queries “clothing”, “shoes”, “dresses” and will be added to the semantic core automatically at the 2nd stage of collection. That is, you can add them, but it would be pointless double work.

What keys do you need to enter? “Low boots” and “ankle boots” are not the same thing as “boots”. It is the word form that is important, not whether these words share the same root.

For some, the list of key phrases will be long, but for others it consists of one word - don’t be alarmed. For example, for an online store of doors, the word “door” may well be enough to compile a semantic core.

And so, at the end of this step we should have a list like this.

2. Collecting queries for the semantic core

For proper, full collection, we need a program. I will show an example using two programs simultaneously:

  • On the paid version - KeyCollector. For those who have it or want to buy it.
  • Free - Slovoeb. A free program for those who are not ready to spend money.

Open the program

Create a new project and call it, for example, Mysite

Now to further collect the semantic core, we need to do several things:

Create a new account on Yandex mail (using an old one is not recommended, because it can get banned for making too many requests). Say you created an account, for example ivan.ivanov@yandex.ru with the password super2018. Now you need to specify this account in the settings as ivan.ivanov:super2018 and click the “save changes” button below. More details in the screenshots.

We select a region to compile a semantic core. You need to select only those regions in which you are going to promote and click save. The frequency of requests and whether they will be included in the collection in principle will depend on this.

All settings are completed, all that remains is to add our list of key phrases prepared in the first step and click the “start collecting” button of the semantic core.

The process is completely automatic and quite long. You can make coffee for now, but if the topic is broad, for example, like the one we are collecting, then this will last for several hours 😉

Once all the phrases are collected, you will see something like this:

And this stage is over - let's move on to the next step.

3. Cleaning the semantic core

First, we need to remove requests that are not interesting to us (non-target):

  • Related to another brand, for example, Gloria Jeans, Ecco
  • Information queries, for example, “I wear boots”, “jeans size”
  • Similar in topic, but not related to your business, for example, “used clothing”, “clothing wholesale”
  • Queries that are in no way related to the topic, for example, “Sims dresses”, “puss in boots” (there are quite a lot of such queries after selecting the semantic core)
  • Requests from other regions, metro, districts, streets (it doesn’t matter which region you collected requests for - another region still comes across)

Cleaning must be done manually as follows:

We enter a word and press “Enter”; if exactly the phrases we want to remove are found in our semantic core, we select what was found and press delete.

I recommend entering not the whole word, but a construction without prepositions and endings, i.e. the stem: if we enter “glori”, it will find phrases such as “buy jeans gloria” regardless of the ending of the brand name, whereas if we entered one specific full form, phrases with other endings would not be found.

Thus, you need to go through all the points and remove unnecessary queries from the semantic core. This may take a significant amount of time, and you may end up deleting most of the collected queries, but the result will be a complete, clean and correct list of all possible promoted queries for your site.

Now upload all your queries to Excel.

You can also remove non-target queries from semantics en masse, provided you have a list. This can be done using stop words, and this is easy to do for a typical group of words with cities, subways, and streets. You can download a list of words that I use at the bottom of the page.
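
If you do have such a list, the mass cleaning boils down to removing every query that contains one of the stems. A small Python sketch, with purely illustrative stems and queries:

```python
# Stems and queries are illustrative only; a real stop list would be much longer.
stems = ["glori", "wholesale", "used", "moscow"]   # brands, regions, etc.

queries = [
    "buy jeans gloria",
    "women's dresses",
    "clothing wholesale prices",
    "men's shorts",
]

cleaned = [q for q in queries if not any(stem in q.lower() for stem in stems)]
print(cleaned)   # ["women's dresses", "men's shorts"]
```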

4. Clustering of the semantic core

This is the most important and interesting part - we need to divide our requests into pages and sections, which together will create the structure of your site. A little theory - what to follow when separating requests:

  • Competitors. You can pay attention to how the semantic core of your competitors from the TOP is clustered and do the same, at least with the main sections. And also see which pages are in the search results for low-frequency queries. For example, if you are not sure whether or not to create a separate section for the query “red leather skirts,” then enter the phrase into the search engine and look at the results. If the search results contain resources with such sections, then it makes sense to make a separate page.
  • Logic. Do the entire grouping of the semantic core using logic: the structure should be clear and form, in your head, a structured tree of pages with categories and subcategories.

And a couple more tips:

  • It is not recommended to place fewer than 3 queries per page.
  • Don’t make too many levels of nesting, try to have 3-4 of them (site.ru/category/subcategory/sub-subcategory)
  • Do not make long URLs, if you have many levels of nesting when clustering the semantic core, try to shorten the urls of categories high in the hierarchy, i.e. instead of “your-site.ru/zhenskaya-odezhda/palto-dlya-zhenshin/krasnoe-palto” do “your-site.ru/zhenshinam/palto/krasnoe”

Now to practice

Kernel clustering as an example

To begin with, let’s categorize all requests into main categories. Looking at the logic of competitors, the main categories for a clothing store will be: men's clothing, women's clothing, children's clothing, as well as a bunch of other categories that are not tied to gender/age, such as simply “shoes”, “outerwear”.

We group the semantic core using Excel. Open our file and act:

  1. We break it down into main sections
  2. Take one section and break it into subsections

I will show you the example of one section - men's clothing and its subsection. In order to separate some keys from others, you need to select the entire sheet and click conditional formatting->cell selection rules->text contains

Now, in the window that opens, write “men” (the stem that occurs in all the men’s-clothing queries) and press Enter.

Now all our keys for men's clothing are highlighted. It is enough to use a filter to separate the selected keys from the rest of our collected semantic core.

So let’s turn on the filter: you need to select the column with queries and click sort and filter->filter

And now let's sort

Create a separate sheet. Cut the highlighted lines and paste them there. You will need to split the kernel in the future using this method.
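
The same split can be scripted instead of done by hand. Here is a hedged pandas sketch that assumes a workbook named core.xlsx with a "query" column (both names are invented) and writes the men’s queries to their own sheet.

```python
import pandas as pd  # reading/writing .xlsx also requires openpyxl

# Assumes a workbook "core.xlsx" with a "query" column; names are invented.
df = pd.read_excel("core.xlsx")

# \b keeps "women's ..." queries out of the men's sheet.
mask = df["query"].str.contains(r"\bmen", case=False, regex=True, na=False)

with pd.ExcelWriter("core_split.xlsx") as writer:
    df[mask].to_excel(writer, sheet_name="Mens clothing", index=False)
    df[~mask].to_excel(writer, sheet_name="All queries", index=False)
```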

Change the name of this sheet to “Men’s clothing”, and call the sheet with the rest of the semantic core “All queries”. Then create another sheet, call it “Structure” and put it first. On the structure page, create a tree. You should get something like this:

Now we need to divide the large men's clothing section into subsections and sub-subsections.

For ease of use and navigation through your clustered semantic core, provide links from the structure to the appropriate sheets. To do this, right-click on the desired item in the structure and do as in the screenshot.

And now you need to methodically separate the requests manually, simultaneously deleting what you may not have been able to notice and delete at the kernel cleaning stage. Ultimately, thanks to clustering of the semantic core, you should end up with a structure similar to this:

So. What we learned to do:

  • Select the queries we need to collect the semantic core
  • Collect all possible phrases for these queries
  • Clean out "garbage"
  • Cluster and create structure

Here is what you can do next thanks to such a clustered semantic core:

  • Create a structure on the site
  • Create a menu
  • Write texts, meta descriptions, titles
  • Collect positions to track dynamics of requests

Now a little about programs and services

Programs for collecting the semantic core

Here I will describe not only programs, but also plugins and online services that I use

  • Yandex Wordstat Assistant is a plugin that makes it convenient to select queries from Wordstat. Great for quickly compiling the core of a small site or 1 page.
  • Keycollector (Slovoeb is the free version) is a full-fledged program for clustering and creating a semantic core. It is very popular. It offers a huge amount of functionality beyond its main purpose: selection of keys from a bunch of other systems, the possibility of auto-clustering, collecting positions in Yandex and Google, and much more.
  • Just-magic is a multifunctional online service for compiling a kernel, auto-breaking, checking the quality of texts and other functions. The service is shareware; to fully operate, you need to pay a subscription fee.

Thank you for reading the article. Thanks to this step-by-step manual, you will be able to create the semantic core of your website for promotion in Yandex and Google. If you have any questions, ask in the comments. Below are the bonuses.


What is the semantic core?

The semantic core is a set of search phrases and words used to promote the site. These search words and phrases help robots determine the topic of a page or an entire service, that is, find out what the company does.

In the Russian language, semantics is a branch of the science of language that studies the semantic content of lexical units of a language. In relation to search engine optimization, this means that the semantic core is the semantic content of the resource. It helps to decide what information to convey to users and in what manner. Therefore, semantics is the foundation, the basis of all SEO.

Why do you need a semantic core of a website and how to use it?

  • The correct semantic core is necessary to accurately calculate the cost of promotion.
  • Semantics is a vector for building internal SEO optimization: the most relevant queries are selected for each service or product so that users and search robots can find them better.
  • Based on it, the site structure and texts for thematic pages are created.
  • Keys from semantics are used to write snippets (short descriptions of the page).

Here is an example of a semantic core compiled for a construction company’s website:

The optimizer collects the semantics, splits it into logical blocks, finds out the number of impressions and, based on the cost of the queries in the tops of Yandex and Google, calculates the total cost of promotion.
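As a rough illustration of that calculation, here is a toy sketch; the figures and the per-query pricing rule are invented for the example and are not any company’s actual rates.

```python
# Toy estimate of promotion cost: sum per-query costs over logical blocks.
# All figures and the pricing rule are invented for illustration only.
core = {
    "houses from laminated veneer lumber": [
        {"query": "build house laminated veneer lumber", "impressions": 1400, "cost_per_month": 120},
        {"query": "laminated veneer lumber house price", "impressions": 600, "cost_per_month": 80},
    ],
    "bath houses": [
        {"query": "build bath house turnkey", "impressions": 900, "cost_per_month": 95},
    ],
}

total = 0
for block, queries in core.items():
    block_cost = sum(q["cost_per_month"] for q in queries)
    print(f"{block}: {block_cost} per month for {len(queries)} queries")
    total += block_cost

print(f"Estimated total promotion cost: {total} per month")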

Of course, when selecting a semantic core, the specifics of the company’s work are taken into account: for example, if the company did not design and build houses from laminated veneer lumber, we would delete the corresponding queries and not use them in the future. Therefore, an obligatory stage of working with semantics is its coordination with the customer: no one knows the specifics of the company’s work better than him.

Types of Keywords

There are several parameters by which key queries are classified.

  1. By frequency:
    • high-frequency – words and phrases with 1000 or more impressions per month;
    • mid-frequency – up to 1000 impressions per month;
    • low-frequency – up to 100 impressions per month.

    Collecting keyword frequencies helps you find out what users search for most often. But a high-frequency query is not necessarily a highly competitive one, and building semantics around high frequency and low competition is one of the main goals of working with the semantic core (a small frequency-bucketing sketch follows this list).

  2. By type:
    • geo-dependent and non-geo-dependent – queries tied to a region and queries that are not;
    • informational – the user wants to get some information from them. Keys of this type are usually used in articles, for example reviews or useful tips;
    • branded – contain the name of the promoted brand;
    • transactional – implying an action from the user (buy, download, order), and so on.

  3. Other – queries that are hard to assign to any type: for example, the key “profiled beam”. Typing such a query into a search engine, the user can mean anything: buying the timber, its properties, comparison with other materials, etc.

    From our company’s experience, it is very difficult to promote a website on such queries: as a rule, they are high-frequency and highly competitive, which makes them not only hard to optimize but also expensive for the client.
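To make the frequency split concrete, here is a minimal sketch that buckets queries by monthly impressions using the thresholds from the list above; the sample queries and their numbers are invented.

```python
# Classify queries by monthly impressions using the thresholds above.
# Sample data is invented; real numbers come from Wordstat or similar tools.
def frequency_band(impressions: int) -> str:
    if impressions >= 1000:
        return "high-frequency"
    if impressions > 100:
        return "mid-frequency"
    return "low-frequency"

sample = {
    "profiled beam": 12000,
    "buy profiled timber moscow": 700,
    "profiled timber 145x145 price per m3": 60,
}

for query, impressions in sample.items():
    print(f"{query!r:45} {impressions:>6}  ->  {frequency_band(impressions)}")
```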

How to collect a semantic core for a website?

  • By analyzing competitor sites (in SEMrush, SerpStat you can see the semantic core of competitors):

The process of compiling a semantic core

The collected queries are not yet a semantic core; here we still need to separate the wheat from the chaff so that all queries are relevant to the client’s services.

To create a semantic core, the queries need to be clustered (divided into blocks according to the logic of service provision). This can be done with programs (for example, KeyAssort or TopSite), which is especially helpful if the semantics are voluminous. Or you can go through the entire list manually, evaluating each query and removing the unsuitable ones.
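For a rough automatic first pass before the manual review, one crude baseline is to group queries by a shared “anchor” word. The sketch below illustrates only that baseline; it is not how KeyAssort or TopSite work internally (they also compare search results), and the stop-word list and sample queries are assumptions.

```python
# Crude baseline clustering: group queries by their longest non-stop word,
# as a first pass before manual review. Sample data is invented.
from collections import defaultdict

STOP_WORDS = {"buy", "price", "order", "cheap", "in", "for", "from"}

def anchor(query: str) -> str:
    words = [w for w in query.lower().split() if w not in STOP_WORDS]
    # Pick the longest remaining word as a rough topic marker.
    return max(words, key=len) if words else query.lower()

queries = [
    "buy profiled timber",
    "profiled timber price",
    "order house from laminated lumber",
    "laminated lumber house projects",
]

clusters = defaultdict(list)
for q in queries:
    clusters[anchor(q)].append(q)

for topic, group in clusters.items():
    print(topic, "->", group)
```

Whatever such a pass produces still has to be checked by eye, for exactly the reasons given in the paragraph above.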

Then send it to the client and check if there are any errors.

A ready-made semantic core is a yellow brick road to a content plan, blog articles, texts for product cards, company news and so on. It is a table of audience needs that you can satisfy using your website.

  • Distribute the keys across pages.
  • Use keywords in meta tags (title, description) and in headings (especially the first-level H1 heading).
  • Insert keys into the texts for pages. This is one of the white-hat optimization methods, but it is important not to overdo it: overspam can result in search engine filters (a rough density check is sketched at the end of this article).
  • Save the remaining search queries, and those that do not fit into any section, under the title “What else to write about”. You can use them for informational articles in the future.
  • And remember: you need to focus on the requests and interests of users, so trying to cram all the keys into one text is pointless.

Collecting a semantic core for a website: main mistakes

  • Refusal of highly competitive keys. Yes, perhaps you will not reach the top for the query “buy profiled timber” (and that will not stop you from successfully selling your services), but you still need to include it in your texts.
  • Refusal of low frequencies. This is wrong for the same reason as rejecting highly competitive queries.
  • Creating pages for queries and for the sake of queries. “Buy profiled timber” and “order profiled timber” are essentially the same thing; there is no point in splitting them into separate pages.
  • Absolute and unconditional trust in the software. You cannot do without SEO programs, but manual analysis and data verification are necessary: no program can yet assess the industry and the level of competition and distribute keys without errors.
  • Keys are our everything. No, our everything is a convenient, understandable website and useful content. Any text needs keys, but if the text is bad, the keys will not save you.
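Finally, to make the “do not overdo it” advice above a bit more tangible, here is a minimal sketch that estimates how large a share of a page’s words belongs to the key phrase; the 3% threshold is an arbitrary illustrative value, not a rule published by search engines.

```python
# Rough overspam check: share of words in the text that belong to the key phrase.
# The 3% threshold is an arbitrary illustrative value, not a search-engine rule.
import re

def key_density(text: str, key: str) -> float:
    words = re.findall(r"\w+", text.lower())
    key_words = set(re.findall(r"\w+", key.lower()))
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in key_words)
    return hits / len(words)

page_text = "Buy profiled timber from the manufacturer. Profiled timber is ..."
density = key_density(page_text, "buy profiled timber")
print(f"key density: {density:.1%}")
if density > 0.03:
    print("looks over-spammed, consider rewriting")
```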