DataForSEO

What are “similarity keywords”, and how can you identify them in Google Ads API?

In some cases, the Google Ads API groups keywords that are very similar to each other and aggregates their metrics under the first keyword of the group. When that happens, the API response returns data only for the first keyword, and all other similar keywords are listed in a separate close_variants array. To include the close_variants array in the API response, set return_close_variants to true.

These “similarity keywords” or “close variants” usually share the same search volume, CPC, monthly searches, and other metrics, which is why Google Ads groups them. For example, the keywords “sale agreement format” and “format of sale agreement” have identical values.

We return such keywords as separate objects in the API response for convenience. However, in some cases, Google Ads may treat certain keywords as part of the same close_variants group even though their values differ. The following example illustrates this.

Suppose you have a list of such keywords: sale agreement format, format of sale agreement, agreement for sale format, format for sale agreement, sale agreement formats, format sale agreement.

If you send each of these keywords in a separate request to the Search Volume endpoint, you’ll notice that some of them return different values for search volume, CPC, and other metrics. However, if you include all these keywords in a single request, you will only receive data for the first keyword in the array. The remaining keywords will be grouped in a close_variants array without any associated data.

Example of one of the batch requests:

[
  {
    "language_code": "en",
    "location_code": 2840,
    "keywords": [
      "sale agreement format",
      "format of sale agreement",
      "agreement for sale format",
      "format for sale agreement",
      "sale agreement formats",
      "format sale agreement"
    ],
    "date_from": "2025-03-01",
    "return_close_variants": true
  }
]

Results of three separate batch requests with changed keyword order:

For instance, if you specify “sale agreement format” as the first keyword in the keywords array of the request, the response will only provide data for this keyword, while the other keywords will be grouped in the close_variants array. The same occurs when you change the order of the keywords in the array – data will only be returned for the first keyword. This happens because Google Ads considers all these keywords as “close variants” of the first keyword.

Although such cases are rare when working with diverse and extensive keyword datasets, they can lead to poor data quality and incomplete responses when they do occur. As a result, you may not receive data for certain grouped keywords. Additionally, this can be an obstacle if you aim to perform granular keyword research and retrieve precise values for each term.
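To spot affected keywords after a request, you can compare the keywords you sent with the keywords that came back with their own data. The sketch below assumes each result item has been decoded into an array with a keyword field and an optional close_variants list of strings. That layout is an assumption, and findGroupedKeywords is a hypothetical helper name, so adjust both to the actual response.

```php
<?php

/**
 * Returns requested keywords that came back only as close variants.
 * Assumed item shape: ['keyword' => string, 'close_variants' => string[]|null].
 */
function findGroupedKeywords(array $requested, array $resultItems): array {
    $withData = [];
    $variants = [];
    foreach ($resultItems as $item) {
        $withData[] = $item['keyword'];
        foreach ($item['close_variants'] ?? [] as $variant) {
            $variants[] = $variant;
        }
    }
    // Keep requested keywords that were listed as a variant
    // and have no result item of their own.
    $grouped = array_diff(array_intersect($requested, $variants), $withData);
    return array_values($grouped);
}

$requested = [
    "sale agreement format",
    "format of sale agreement",
    "format sale agreement",
];

// Hand-written sample that mirrors the behavior described above, not real API output.
$resultItems = [
    [
        'keyword' => "sale agreement format",
        'close_variants' => ["format of sale agreement", "format sale agreement"],
    ],
];

$grouped = findGroupedKeywords($requested, $resultItems);
print_r($grouped);
```

The keywords returned by this helper are the ones worth resending in a separate request.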

Fortunately, we have two approaches for identifying “similarity keywords” and preventing keyword grouping in API responses.

Addressing the “similarity keyword” issue in Google Ads API

1 Specify keywords in a random order in API requests

If you pull keywords from a separate SQL database, avoid deterministic ordering such as alphabetical order. Sorted output tends to place similar keywords in the same request, which increases the risk of keyword grouping by close variants in the response. For example, avoid using queries like:

SELECT * FROM keywords ORDER BY keyword_name

Instead, apply random ordering to the keywords you want to send (random() works in PostgreSQL and SQLite; MySQL uses RAND()):

SELECT * FROM keywords ORDER BY random()

However, it’s important to note that this method can cause additional load on the database, and it doesn’t guarantee the complete elimination of similar queries in the API request.
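If random ordering in the database is too expensive, one alternative is to fetch the keywords in whatever order is cheapest and shuffle them in application code. This is a minimal sketch, assuming the keywords are already loaded into a PHP array; buildShuffledBatches and the batch size of 2 are illustrative and not part of the API.

```php
<?php

/**
 * Shuffles keywords and splits them into batches of at most $batchSize.
 */
function buildShuffledBatches(array $keywords, int $batchSize): array {
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }
    // shuffle() works on the local copy, so the caller's array keeps its order.
    shuffle($keywords);
    return array_chunk($keywords, $batchSize);
}

$keywords = [
    "sale agreement format",
    "format of sale agreement",
    "agreement for sale format",
    "format for sale agreement",
    "sale agreement formats",
    "format sale agreement",
];

$batches = buildShuffledBatches($keywords, 2);
print_r($batches);
```

Like random ordering in SQL, this only lowers the chance that similar keywords land in the same request; it does not rule it out.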

2 Identify “similarity keywords” before sending a request

Another option is to identify and remove “similarity keywords” beforehand using functions that determine the similarity between keywords. Such functions are available in many programming languages, including PHP.

Similarity functions can display the similarity percentage of other keywords in the dataset compared to a specific keyword. For example, here is a similarity comparison of the keyword “sale agreement format” with other keywords in the payload:

Based on extensive testing, our team concluded that the similarity level must be below 40% to prevent keyword grouping in the API response.
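As a rough illustration, PHP's built-in similar_text() reports this percentage through its third argument. The sketch below compares one base keyword with the rest of the list. similarityTo is a hypothetical helper name, and the printed percentages depend on the input, so none are quoted here.

```php
<?php

/**
 * Returns the similar_text() percentage of each candidate against $base,
 * keyed by candidate keyword.
 */
function similarityTo(string $base, array $candidates): array {
    $scores = [];
    foreach ($candidates as $candidate) {
        $percent = 0.0;
        // similar_text() fills $percent by reference with a value from 0 to 100.
        similar_text($candidate, $base, $percent);
        $scores[$candidate] = $percent;
    }
    return $scores;
}

$scores = similarityTo("sale agreement format", [
    "format of sale agreement",
    "agreement for sale format",
    "format for sale agreement",
    "sale agreement formats",
    "format sale agreement",
]);

foreach ($scores as $keyword => $percent) {
    printf("%-28s %5.1f%%\n", $keyword, $percent);
}
```

Any keyword that scores at or above the chosen threshold is a candidate for a separate request.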

To implement this approach, you can follow the algorithm below:

1. Select more keywords from the database than required for the payload.
2. Add the first keyword to the request.
3. Compare the second keyword with the first keyword. If the similarity is less than 40%, add the second keyword to the request. If the similarity is 40% or greater, add the second keyword to a different request.
4. Compare the third keyword with each keyword already added to the request. If the similarity with every one of them is less than 40%, add the third keyword to the request. If the similarity with any of them is 40% or greater, add the third keyword to a different request.
5. Repeat this process for the remaining keywords in the dataset.

Below is an example of this algorithm written in PHP:

<?php

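/**
 * Builds a payload in which no keyword is similar to an earlier one
 * at or above $threshold percent. Keywords that are too similar are
 * skipped and should be sent in a different request.
 */
function createPayload(array $keywords, float $threshold): array {
    $payload = [];
    // Iterating over every keyword also handles an empty input array safely.
    foreach ($keywords as $keyword) {
        $addToPayload = true;
        foreach ($payload as $existingKeyword) {
            $similarity = 0.0;
            // similar_text() writes the similarity percentage (0 to 100) into $similarity.
            similar_text($keyword, $existingKeyword, $similarity);
            if ($similarity >= $threshold) {
                $addToPayload = false;
                break;
            }
        }
        if ($addToPayload) {
            $payload[] = $keyword;
        }
    }
    return $payload;
}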

$keywords = [
    "sale agreement format",
    "format of sale agreement",
    "agreement for sale format",
    "format for sale agreement",
    "sale agreement formats",
    "format sale agreement",
];

$keywordsForPayload = createPayload($keywords, 40);

Using this algorithm, you can build a request in which each keyword stays below the 40% similarity threshold against the keywords added before it. Keywords that createPayload() leaves out can be run through the same function again to form the next request.
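To distribute a whole keyword list across several requests in one pass, the same similarity check can be applied per request. The following is a minimal sketch rather than a reference implementation: splitIntoRequests is a hypothetical helper, it reuses the similar_text() check and the 40% threshold from the example above, and it ignores any per-request keyword limits.

```php
<?php

/**
 * Distributes keywords over requests so that, within each request, every
 * keyword scores below $threshold percent against the keywords added before it.
 */
function splitIntoRequests(array $keywords, float $threshold): array {
    $requests = [];
    foreach ($keywords as $keyword) {
        $placed = false;
        foreach ($requests as $index => $request) {
            $fits = true;
            foreach ($request as $existingKeyword) {
                $similarity = 0.0;
                similar_text($keyword, $existingKeyword, $similarity);
                if ($similarity >= $threshold) {
                    $fits = false;
                    break;
                }
            }
            if ($fits) {
                // Place the keyword in the first request where it fits.
                $requests[$index][] = $keyword;
                $placed = true;
                break;
            }
        }
        if (!$placed) {
            // No existing request fits, so start a new one.
            $requests[] = [$keyword];
        }
    }
    return $requests;
}

$keywords = [
    "sale agreement format",
    "format of sale agreement",
    "agreement for sale format",
    "format for sale agreement",
    "sale agreement formats",
    "format sale agreement",
];

$requests = splitIntoRequests($keywords, 40);
print_r($requests);
```

Each inner array can then be sent as the keywords field of its own request. Highly similar lists may end up with one keyword per request, which matches sending each keyword separately.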

The “similarity keyword” issue is not particularly common, but when it occurs, it can reduce the quality and completeness of the data you retrieve. You can address it with either of the approaches described above. If you encounter any other problems or have additional questions, don’t hesitate to contact our 24/7 support team.
