On today’s post I am going to show you how you can use Keyword Planner API and Python to extract keyword ideas segmented by language with a practical case. From my point of view, being able to segment by language when doing a keyword research is a very interesting feature that other tools do not offer which enables you to find potential keyword niches for minority languages in very competitive markets.

The process that we will follow is:

  1. Importing the seed terms: to be able to get the keyword ideas we need a list of seed terms that we are going to obtain from Github based on their frequency. In this way, we can obtain most of the queries that are done in a country for a specific language and create a quite extensive database. However, if you would like to do your keyword research focused on an industry rather than a generic one, then you might need to use a list of more specific terms of that industry.
  2. Making the requests to the API: for this practical example we will insert a list of seed terms in Spanish and we will extract the keyword ideas of queries that are done in Spanish in the United States through Keyword Planner API and Python.
  3. Exporting to Excel file: finally, we can dump all the data into an Excel file if we would like to do further analyses.

This is just a practical example, but if you would like to find more information about how to set up Keyword Planner API, you can have a read at the guide that I published to use Keyword Planner API with Python.

1.- Finding and importing the seed terms

As mentioned before, we will get the list of seed terms from Github, thanks to the amazing work that was done by the contributor Hermit Daves, who created a repository with lots of lists of language terms from OpenSubtitles based on their frequency. With Pandas, we can fetch txt and csv files directly from Github without having to download them if we use the parameter on “?raw=true” at the end of Github’s URL.

import pandas as pd

url = 'https://github.com/hermitdave/FrequencyWords/blob/master/content/2018/es/es_50k.txt?raw=true'
df = pd.read_csv(url,header=None, sep=" ")
seed_terms = [x for x in df[0]]

2.- Making the requests to the API

In order to make the requests to the API we will iterate over the list of seed terms and we will extract the keyword ideas for each of them. Keep in mind, that before proceeding with this piece of code, you need to configure the Google Ads account and create the YAML file with the credentials as explained in the guide to Keyword Planner API. Also, you will need to check these two pages to get the location and language IDS to localize the keyword research: Geotargets and Language Codes.

In the case of our practical example, the location ID for the USA is 2840 and the Spanish language ID is 1003. The total number of terms from the seed terms list is 50.000, but if we do not need to extract so many keyword ideas, we can limit it. In the case below, I cap it to the first 1.000 terms. At the end of each iteration we let the script sleep for 3 seconds to avoid the API overloading.

from google.ads.googleads.client import GoogleAdsClient
import time

client = GoogleAdsClient.load_from_storage("<your-yaml-file-name>")

list_keywords = []
for x in seed_terms[0:1000]:
    list_keywords = list_keywords + main(client, "<your-client-id>", ["2840"], "1003", [x] , None)

3.- Exporting to Excel

Finally, we can export it as an Excel file with Pandas. The output will contain seven columns with the keyword, the average monthly search, the competition level, the competition index, the searches from the past months, the past months and the queries categorizations (Branded, Non-branded, Cars, Year, etcetera). Especial mentions to Alex Papageorgiou’s because he taught me how to also extract the keyword categories.

First, we will need to adjust a bit our list with the keywords to be able to export it with Pandas and then we will export it.

list_to_excel = []
for x in range(len(list_keywords)):
    list_months = []
    list_searches = []
    list_annotations = []
    for y in list_keywords[x].keyword_idea_metrics.monthly_search_volumes:
        list_months.append(str(y.month)[12::] + " - " + str(y.year))
    for y in list_keywords[x].keyword_annotations.concepts:
    list_to_excel.append([list_keywords[x].text, list_keywords[x].keyword_idea_metrics.avg_monthly_searches, str(list_keywords[x].keyword_idea_metrics.competition)[28::], list_keywords[x].keyword_idea_metrics.competition_index, list_searches, list_months, list_annotations ])
pd.DataFrame(list_to_excel, columns = ["Keyword", "Average Searches", "Competition Level", "Competition Index", "Searches Past Months", "Past Months", "List Annotations"]).to_excel('output.xlsx', header=True, index=False)

The final output of this exercise looks like:

Having the option of filtering by category is super convenient as we can spot terms related to your industry quite easily and fast by just filtering the Category column for the categories you are interested in the most. For instance, if I wanted to check queries about language related doubts I would only need to filter by “Language”.

When analyzing the file, you can notice that in the beginning most of the terms come from articles or connectors terms and the generated keywords are not super meaningful, however, once more specific terms are inputed, it throws much more insightful and meaningful keyword ideas.

That is all folks, I hope that you found this article interesting!