In today’s post I am going to show you a very useful and easy trick: scraping the SERPs with a Chrome extension called Web Scraper. You do not need to know how to code to use it, as it works in a very intuitive way: you only need to select the elements that you would like to scrape. We are going to use Web Scraper for SERP scraping, but it can actually be used to scrape any website once you know how to work with it. However, I must also say that if you need to scrape the SERPs in bulk for many queries, it is best to use proxies, as I explain in this article where I used the Oxylabs SERPs API and Python for SERP scraping.
Apart from scraping the actual rankings in the SERPs for some queries, this Chrome extension, together with Google search operators such as “site:”, “intitle:”, “inurl:” and so on, can be very helpful for analyzing the type of pages that are indexed from a website. In this post from Ahrefs you can find all of these operators.
To illustrate how to use Web Scraper, I am going to scrape the results indexed with “site:” for some of Oncrawl’s blog categories and export them as an Excel file. Let’s get started!
1.- Preparing Google for the scraping
First of all, to avoid having to crawl many SERP pages, we need to change one of Google’s settings. Go to https://www.google.com/ and, in the lower right corner, click on “Settings” and then on “Search Settings”.
After this, go to the “Results per page” section, select 100 results per page and save. Google will now display 100 results per page instead of the usual 10 or so, which will make our scraping much faster and easier.
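The same effect can also be reproduced directly in the search URL: Google has historically honored a `num` parameter that requests up to 100 results on a single page. As a minimal sketch (the `serp_url` helper name is my own, and `num` support may change on Google’s side), this is how such a URL could be built:

```python
from urllib.parse import urlencode

def serp_url(query: str, num: int = 100) -> str:
    """Build a Google SERP URL asking for up to `num` results per page.

    The `num` parameter mirrors the "Results per page" setting above:
    more results per page means fewer pagination pages to scrape.
    """
    return "https://www.google.com/search?" + urlencode({"q": query, "num": num})

# Example: the query used later in this post
print(serp_url("site:https://www.oncrawl.com/technical-seo/"))
```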
2.- Getting familiar with Web Scraper
Web Scraper can be added to Chrome from the Chrome Web Store. Once it is installed, its tab shows up in the Developer Tools, but only if the panel is docked to the bottom, as in the example below:
Now that we have checked that the installation was successful, we can start using the tool. In my case, I am going to scrape all the results indexed from https://www.oncrawl.com/technical-seo/ with the query “site:https://www.oncrawl.com/technical-seo/”.
So I am going to create my first sitemap, entering in the URL field the URL that Google creates when I run the query “site:https://www.oncrawl.com/technical-seo/”.
3.- Creating the selectors for each element
Creating the selectors for each element is very easy. We need to give the element a title and select the elements that we would like to scrape. As you can see in the screenshot below, you only need to click on two of the elements you want to scrape and the rest of the elements that share the same selection path will be selected automatically:
We choose the type “Link” because we want to get both the metatitle and the linked URL, and we tick the “Multiple” box to extract all the metatitles, not just one.
We also need to create another selector to move across the pagination:
Both the pagination and the metatitle selectors need to be triggered by the parent selectors “root” and “pagination”. The selector graph looks like this:
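To make the structure above concrete, here is a sketch of what the sitemap JSON behind this configuration can look like. The selector ids (“pagination”, “metatitle”) are the names used in this post, but the CSS selector strings (`a.fl`, `h3 a`) are assumptions for illustration; the actual values are generated by Web Scraper when you click on the page elements, and the exact fields may vary between extension versions:

```python
import json

# A minimal sketch of a Web Scraper sitemap for this setup.
# Both selectors list "_root" and "pagination" as parents, so they run on
# the first SERP page and on every page reached through the pagination links.
sitemap = {
    "_id": "oncrawl_technical_seo",
    "startUrl": ["https://www.google.com/search?q=site%3Ahttps%3A%2F%2Fwww.oncrawl.com%2Ftechnical-seo%2F&num=100"],
    "selectors": [
        {
            "id": "pagination",
            "type": "SelectorLink",
            "parentSelectors": ["_root", "pagination"],  # follows itself across pages
            "selector": "a.fl",  # assumed CSS selector for the pagination links
            "multiple": True,
        },
        {
            "id": "metatitle",
            "type": "SelectorLink",
            "parentSelectors": ["_root", "pagination"],  # runs on every paginated page
            "selector": "h3 a",  # assumed CSS selector for the result titles
            "multiple": True,
        },
    ],
}

print(json.dumps(sitemap, indent=2))
```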
After this, we can start the scraping by going to the “Scrape” tab in the dropdown menu and launching the scrape. This will open a new browser tab that extracts the metatitles and links from the SERP results.
Once it has finished, we can export all the data as a CSV file:
When opened in Excel, the final output looks like this:
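If you prefer to post-process the export programmatically, the CSV can be read with a few lines of Python. The sample rows below are hypothetical, but the column naming follows Web Scraper’s convention: one column per selector id, plus a “-href” column holding the linked URL for “Link” selectors:

```python
import csv
import io

# Hypothetical excerpt of the exported CSV for a Link selector named "metatitle":
# a "metatitle" column with the anchor text and a "metatitle-href" column
# with the linked URL.
sample = """metatitle,metatitle-href
"Technical SEO - Oncrawl blog",https://www.oncrawl.com/technical-seo/
"Log file analysis - Oncrawl",https://www.oncrawl.com/technical-seo/log-file-analysis/
"""

# Collect the indexed URLs from the href column
indexed_urls = [row["metatitle-href"] for row in csv.DictReader(io.StringIO(sample))]
print(len(indexed_urls), "indexed URLs extracted")
```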
4.- Replicating the process
If we would like to extract the pages indexed from another directory, we can do it very easily by exporting the sitemap we have created and reusing it with another Google search URL. The sitemap should look like this:
The only change needed to reuse it for another SERP scrape is replacing the startUrl value with the new start URL. In my case, I have entered the URL that Google generates for the query “site:https://www.oncrawl.com/oncrawl-seo-thoughts/”, as this time I am going to extract the indexed URLs from the SEO thoughts directory.
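That startUrl swap can also be scripted, which is handy if you want to repeat it for many directories. In this sketch, the `exported` string is a trimmed stand-in for the JSON you copy from Web Scraper’s export tab, showing only the fields touched below:

```python
import json
from urllib.parse import urlencode

# Trimmed stand-in for the exported sitemap JSON (selectors omitted for brevity)
exported = '{"_id": "oncrawl_technical_seo", "startUrl": ["https://www.google.com/search?q=..."], "selectors": []}'

sitemap = json.loads(exported)

# Point the sitemap at the new directory's "site:" query
new_query = "site:https://www.oncrawl.com/oncrawl-seo-thoughts/"
sitemap["startUrl"] = ["https://www.google.com/search?" + urlencode({"q": new_query, "num": 100})]
sitemap["_id"] = "oncrawl_seo_thoughts"  # each imported sitemap needs a unique name

# Paste this output into Web Scraper's "Import sitemap" tab
print(json.dumps(sitemap, indent=2))
```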
Once you have entered your new start URL in the sitemap, you only need to import it and give it a name:
This time, as we already have the selectors in place, we only need to go to the Scrape tab and scrape it right away!
That’s all folks, I hope that this simple trick can make your life much easier!