SEO, September 6, 2018 | 11 min read

Using Scripts To Scrape SERPs For Different Results

Gary Stevens

Front end developer, full time blockchain geek and a volunteer working for the Ethereum foundation as well as an active Github contributor

Tell me if this has ever happened to you before.

You're in the middle of a project for yourself or a client. To successfully hit your deadline and meet your targets, you need access to detailed information that is housed deep in the underbelly of the search engine results page. But there's a problem…

For whatever reason (likely a draconian line of code in Google's latest update) you are unable to access or export the information you need.

Uh oh.

After 4 double espressos and 12 hours of tediously entering the data by hand (or paying a freelancer a multiple 4-figure sum), you finally finish the project moments before the deadline, completely and utterly exhausted.

If you're a seasoned SEO or webmaster, then I can all but guarantee that the above scenario has happened to you at least once.

And today, I'm going to show you how you can use Python to scrape the SERPs (or your website) for different results so that you can get the data you need… in 15 minutes or less.

Sound like a plan? Then let's get to it.

Why Scrape Using Python?

Despite the fact that more than 5.5 billion Google searches are submitted each day, the executive team behind the search engine seems hell-bent on preventing everyday SEOs like you and me from accessing the data we need to successfully execute our campaigns.

And I get it…

With the proliferation of black hat SEOs attempting to game the algorithm and pull one over on the system, big search engines like Google don't really have any other choice. But that doesn't make the state of search engine metrics in 2018 any easier to live with.

If you want to uncover key metrics behind your website's SEO performance (that aren't available through analytics), scan for security weaknesses, or gain competitive intelligence... the data is there for the taking. But you have to know how to get it.

Sure, you could drop $297/month on an expensive piece of software (that will become entirely obsolete within the next 12 months) or you could just use the strategy I'm about to share with you to find all of the data you need in a matter of minutes.

To do this, we'll be using a coding language called Python to scrape Google's SERPs so that you can quickly and easily gain access to the information you need without the hassle.

Let me show you how.

Basic Requirements

Before we can begin, you're going to need to have a few basic items and libraries installed to successfully scrape the SERPs.

Specifically, you'll need:

Python, plus two libraries:

Splinter
Pandas

Google Chrome (duh!)

Chromedriver

For those of you who are lazy, er… prefer to work smart, I recommend that you simply uninstall your existing Python distribution and download the Anaconda distribution instead. It comes prepackaged with the pandas library (among many others), making it an incredibly versatile and robust distribution package.

If you're happy with your existing Python distribution, then simply run the following commands in your terminal.

To get pandas, to get Splinter, and to get Splinter with Anaconda, respectively:

pip install pandas
pip install splinter
conda install splinter
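
Before moving on, it can help to confirm that both libraries are importable from the Python interpreter you are actually running. The following is a minimal sketch that uses only the standard library; the helper name missing_packages is made up for this example.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that this interpreter cannot import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["splinter", "pandas"])
    if missing:
        print("Still need to install:", ", ".join(missing))
    else:
        print("Both libraries are importable.")
```

If anything is reported as missing, the install commands above were probably run against a different Python installation than the one executing the script.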

With your libraries installed, it's time to get down to business and start scraping.

Step #1: Set Up Your Libraries and Browser

The first thing you'll need to do once your libraries are installed is to import them and set up your browser object.

For example:

from splinter import Browser
import pandas as pd
# open a browser
browser = Browser('chrome')

Note: If you want to scrape a 'responsive' web page, use set_window_size to set the browser window's dimensions so that all of the elements you need to access are properly displayed.

# Width, Height
browser.driver.set_window_size(640, 480)

Once the browser is successfully set up, it's time to visit Google. (I don't think you need a tutorial for this one.)
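
For completeness, here is a hedged sketch of that step. Splinter's browser object has a visit() method that loads a URL and a url attribute that reports the current address; the function name open_google and the GOOGLE_URL constant are assumptions made for this illustration.

```python
GOOGLE_URL = "https://www.google.com"  # assumed start page for this walkthrough


def open_google(browser, url=GOOGLE_URL):
    """Point an already-created Splinter browser at Google's home page."""
    browser.visit(url)
    # Report the address the browser ended up on (it may differ after redirects).
    return browser.url
```

With the browser object created in Step #1, calling open_google(browser) should leave the Chrome window sitting on the Google home page.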

Step #2: Exploring the Website

Alright, alright, alright… We've successfully made it to the front page and now, it's time to navigate the website.

Luckily, this process is pretty straightforward.

All you're going to do is:

Find an HTML element.

Perform an action on that element.

Finding an HTML element is simple. Right-click on the page and select "Inspect" to open the Google Chrome Developer Tools. Then go to the upper left-hand corner of the Developer Tools panel and click the "Inspect" icon.

From here, you're going to use the inspector cursor (not to be confused with Inspector Gadget) to click on the section of the website which you want to control. In this case, the search bar.

Next, you're going to right click on the HTML element and select "Copy" > "Copy XPath".

And boom! You're ready to go.

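Step #3: Control the Website

The XPath is the most important piece of this entire puzzle, so you'll want to keep it safe by storing it in a Python variable. Google changes its markup regularly, so use the XPath you copied from your own inspector if it differs from the one shown here.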

search_bar_xpath = '//*[@id="lst-ib"]'

From here, we will pass this XPath to a method of the Splinter Browser object: find_by_xpath().

This will find all of the elements that match the XPath you pass and return them as a list of element objects.

Next, we'll want to get hold of the individual HTML element by using the following lines of code:

search_bar_xpath = '//*[@id="lst-ib"]'
# index 0 to select from the list
search_bar = browser.find_by_xpath(search_bar_xpath)[0]

Finally, we will set up code to fill the search bar and click the search button.

search_bar.fill("serpstat.com")
# Now let's set up code to click the search button!
search_button_xpath = '//*[@id="tsf"]/div[2]/div[3]/center/input[1]'
search_button = browser.find_by_xpath(search_button_xpath)[0]
search_button.click()

The above lines of code will type serpstat.com into the search bar and then click the search button. Once this code is executed you'll be brought to the search engine results page and it's finally time to start scraping that data like old gum off of a shoe… or something like that.
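
XPaths copied from a live page tend to stop matching when the page's markup changes, so it can be worth checking for an empty result before indexing into it. The sketch below wraps the fill-and-click steps above in two helpers; first_match and run_search are hypothetical names for this example, not part of Splinter, and the only Splinter calls assumed are find_by_xpath(), fill(), and click().

```python
def first_match(browser, xpath):
    """Return the first element matching xpath, or raise a readable error."""
    matches = browser.find_by_xpath(xpath)
    if len(matches) == 0:
        raise LookupError(f"No element matches XPath: {xpath}")
    return matches[0]


def run_search(browser, query, bar_xpath, button_xpath):
    """Type query into the search bar, then click the search button."""
    first_match(browser, bar_xpath).fill(query)
    first_match(browser, button_xpath).click()
```

Using the variables defined earlier, the call would look like run_search(browser, "serpstat.com", search_bar_xpath, search_button_xpath).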

Step #4: Sit Back and Watch the Magic Happen (Scrape Time!)

In this example, I'll show you how to scrape the titles and links for each of the websites on the first page of the SERPs.

Notice how each search result is contained within an h3 title tag with a class "r". It's also important to remember that the title and the link we want are both stored within an a-tag.

Ok, so the XPath of the highlighted tag is //*[@id="rso"]/div/div/div[1]/div/div/h3/a.

But it's only the first link. And much like the iconic rock band Queen, we want it all.

To get all of the links from the SERPs, we're going to shift things around a bit to ensure that our find_by_xpath code returns all of the results from the page.

Here's how:

search_results_xpath = '//h3[@class="r"]/a'  
search_results = browser.find_by_xpath(search_results_xpath)

This XPath tells Splinter to find all of the h3-tags with the "r" class and then return the a-tag, along with its data, from each one.

Now, it's time to sit back and let the magic happen.

To extract the title and link for each search result, all you need to do is run the following lines of code in your Python session.
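
To see what that XPath selects without opening a browser, here is a small sketch that runs the same expression against a made-up snippet using Python's built-in xml.etree.ElementTree. The SAMPLE markup and the result_links helper are invented for illustration and are not real Google markup; ElementTree supports only a subset of XPath and needs a leading "." on the expression.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed stand-in for a results page (not real Google markup).
SAMPLE = """
<div id="rso">
  <div><h3 class="r"><a href="https://example.com/one">First result</a></h3></div>
  <div><h3 class="r"><a href="https://example.com/two">Second result</a></h3></div>
  <div><h3 class="other"><a href="https://example.com/ad">Not a result</a></h3></div>
</div>
"""


def result_links(markup):
    """Return (title, href) pairs for every <a> directly inside <h3 class="r">."""
    root = ET.fromstring(markup)
    # Same idea as //h3[@class="r"]/a, written in ElementTree's relative form.
    anchors = root.findall(".//h3[@class='r']/a")
    return [(a.text, a.get("href")) for a in anchors]


print(result_links(SAMPLE))
```

Only the two links inside h3 tags with the "r" class come back; the third link is skipped because its h3 has a different class.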

scraped_data = []
for search_result in search_results:
    # In Python 3, .text is already a str; encoding it would put b'...' in the CSV
    title = search_result.text
    link = search_result["href"]
    scraped_data.append((title, link))

And now, all of the titles and links have been scraped and stored in the scraped_data list.

To export all of the data to CSV, you can simply use a pandas DataFrame with the following 2 lines of code:

df = pd.DataFrame(data=scraped_data, columns=["Title", "Link"])
# index=False keeps the row numbers out of the file, leaving only Title and Link
df.to_csv("links.csv", index=False)

These two lines of code will create a CSV file with the headers Title and Link and all of the data that was just scraped.
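
If you would rather not pull in pandas just for the export, the standard library's csv module can write the same kind of file. This is an alternative sketch, not the method used above; write_links_csv is a hypothetical helper and the row in the demo is made up.

```python
import csv
import io


def write_links_csv(rows, handle):
    """Write (title, link) rows to an open text handle, preceded by a header row."""
    writer = csv.writer(handle)
    writer.writerow(["Title", "Link"])
    writer.writerows(rows)


# Demo with an in-memory buffer and a made-up row. To write a real file, pass
# open("links.csv", "w", newline="", encoding="utf-8") instead of the buffer.
buffer = io.StringIO()
write_links_csv([("Example title, with comma", "https://example.com")], buffer)
print(buffer.getvalue())
```

The csv module quotes any title that contains a comma, so the file still has exactly two columns.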

Pretty cool, huh?

And that, ladies and gentlemen, is how you scrape the Google SERPs using Python.

Final Thoughts

Website scraping is an invaluable web development skill that will allow you to take back control of your data and uncover many of the "secrets" that Google has hidden right below the surface.

By following the simple framework I outlined today, you can quickly and easily gain access to just about any information you need with a few lines of code and the click of a button (or mouse as it were).

So go out and scrape away!

Did you find this article helpful? Do you have any tips or tricks for improving your website scraping abilities? Comments, questions, concerns? Feel free to drop us a line below and let us know!
