SEO · 11 min read · September 6, 2018

Using Scripts To Scrape SERPs For Different Results


Gary Stevens
Front-end developer, full-time blockchain geek, and volunteer working for the Ethereum Foundation, as well as an active GitHub contributor
Tell me if this has ever happened to you before.

You're in the middle of a project for yourself or a client. To successfully hit your deadline and meet your targets, you need access to detailed information that is housed deep in the underbelly of the search engine results page. But there's a problem…

For whatever reason (likely a draconian line of code in Google's latest update) you are unable to access or export the information you need.

Uh oh.
After 4 double espressos and 12 hours of tediously entering the data by hand (or paying a freelancer a hefty four-figure sum), you finally finish the project moments before the deadline, completely and utterly exhausted.

If you're a seasoned SEO or webmaster, then I can all but guarantee that the above scenario has happened to you at least once.

And today, I'm going to show you how you can use Python to scrape the SERPs (or your website) for different results so that you can get the data you need… in 15 minutes or less.

Sound like a plan? Then let's get to it.

Why Scrape Using Python?

Despite the fact that more than 5.5 billion Google searches are submitted each day, the executive team behind the search engine seems hell bent on preventing everyday SEOs like you and me from accessing the data we need to successfully execute our campaigns.
And I get it…

With the proliferation of black hat SEOs attempting to game the algorithm and pull one over on the system, big search engines like Google don't really have any other choice. But that doesn't make the state of search engine metrics in 2018 any easier to live with.

If you want to uncover key metrics behind your website's SEO performance (that aren't available through analytics), scan for security weaknesses, or gain competitive intelligence... the data is there for the taking. But you have to know how to get it.

Sure, you could drop $297/month on an expensive piece of software (that will become entirely obsolete within the next 12 months) or you could just use the strategy I'm about to share with you to find all of the data you need in a matter of minutes.

To do this, we'll be using a coding language called Python to scrape Google's SERPs so that you can quickly and easily gain access to the information you need without the hassle.

Let me show you how.

Basic Requirements

Before we can begin, you're going to need to have a few basic items and libraries installed to successfully scrape the SERPs.

Specifically, you'll need:

  • Python, with the following libraries:
      • Splinter
      • pandas
  • Google Chrome (duh!)
  • Chromedriver

For those of you who are lazy, er… prefer to work smart, I recommend that you simply uninstall your existing Python distribution and download this one from Anaconda. It comes prepackaged with the pandas library (among many others), making it an incredibly versatile and robust distribution package.

If you're happy with your existing Python distribution, then simply plug the following lines of code into your terminal.

# To get pandas
pip install pandas
# To get Splinter
pip install splinter
# To get Splinter with Anaconda
conda install splinter
With your libraries installed, it's time to get down to business and start scraping.

Step #1: Set Up Your Libraries and Browser

The first thing you'll need to do once your libraries are installed is to import them and set up your browser object.

For example:

from splinter import Browser
import pandas as pd
# open a browser
browser = Browser('chrome')
Note: If you want to scrape a 'responsive' web page, then you'll need to call set_window_size to make sure that all of the elements you need to access are properly displayed.
# Width, Height
browser.driver.set_window_size(640, 480) 
Once the browser is successfully set up, it's time to visit Google. (I don't think you need a tutorial for this one.)

Step #2: Exploring the Website

Alright, alright, alright… We've successfully made it to the front page and now, it's time to navigate the website.

Luckily, this process is pretty straightforward.

All you're going to do is:
1. Find an HTML element.
2. Perform an action on that element.
Finding an HTML element is simple: right-click on the website and select "Inspect" to open the Google Chrome Developer tools. Then navigate to the upper left-hand corner of the developer tools box and click on the "Inspect" icon.
From here, you're going to use the inspector cursor (not to be confused with Inspector Gadget) to click on the section of the website which you want to control. In this case, the search bar.

Next, you're going to right click on the HTML element and select "Copy" > "Copy XPath".

And boom! You're ready to go.

Step #3: Control the Website

The XPath is the most important piece of this entire puzzle, so you'll want to keep it safe by storing it in the following variable in Python.
search_bar_xpath = '//*[@id="lst-ib"]'
From here, we will pass this XPath to the find_by_xpath() method of the Splinter Browser object.
This will extract all of the elements that match the XPath you pass and return a comprehensive list of element objects.
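Since find_by_xpath() needs a live browser session to run, here's a browser-free sketch of the same XPath semantics using Python's built-in xml.etree.ElementTree. The markup below is a made-up stand-in for Google's search page (ElementTree's XPath subset uses a leading ".//" where a browser would use "//"):

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Google's markup; "lst-ib" mirrors the id used above
html = '<div><input id="lst-ib" name="q"/><input id="other"/></div>'
root = ET.fromstring(html)

# Attribute predicates work the same way as in the browser XPath
matches = root.findall('.//input[@id="lst-ib"]')

# Like Splinter's find_by_xpath(), findall() returns a list; index 0 picks the first match
search_bar = matches[0]
print(len(matches), search_bar.get('name'))  # → 1 q
```

The key takeaway is that an XPath query always hands you back a list of matches, which is why the Splinter code below indexes with [0].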

Next, we'll grab the individual HTML element using the following lines of code:
search_bar_xpath = '//*[@id="lst-ib"]'
# index 0 to select from the list
search_bar = browser.find_by_xpath(search_bar_xpath)[0]
Finally, we will set up code to fill and click the search button.
search_bar.fill("serpstat.com")
# Now let's set up code to click the search button!
search_button_xpath = '//*[@id="tsf"]/div[2]/div[3]/center/input[1]'
search_button = browser.find_by_xpath(search_button_xpath)[0]
search_button.click()
The above lines of code will type serpstat.com into the search bar and then click the search button. Once this code is executed you'll be brought to the search engine results page and it's finally time to start scraping that data like old gum off of a shoe… or something like that.

Step #4: Sit Back and Watch the Magic Happen (Scrape Time!)

In this example, I'll show you how to scrape the titles and links for each of the websites on the first page of the SERPs.

Notice how each search result is contained within an h3 title tag with a class "r". It's also important to remember that the title and the link we want are both stored within an a-tag.

Ok, so the XPath of the highlighted tag is //*[@id="rso"]/div/div/div[1]/div/div/h3/a.

But that's only the first link. And much like the iconic rock band Queen, we want it all.

To get all of the links from the SERPs, we're going to shift things around a bit to ensure that our find_by_xpath code returns all of the results from the page.

Here's how:
search_results_xpath = '//h3[@class="r"]/a'
search_results = browser.find_by_xpath(search_results_xpath)
This XPath tells Python to find all of the h3-tags with the "r" class and then extract the a-tag and its data from each one.
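To see why this relative XPath grabs every result rather than just the first, here's a browser-free sketch against a hand-rolled snippet of SERP-style markup (the titles and URLs below are invented for illustration), again using the standard-library ElementTree:

```python
import xml.etree.ElementTree as ET

# Invented stand-in for a results page: several h3.r blocks, each wrapping a link
html = '''
<div id="rso">
  <h3 class="r"><a href="https://serpstat.com/">Serpstat</a></h3>
  <h3 class="r"><a href="https://example.com/">Example</a></h3>
  <h3 class="other"><a href="https://skip.me/">Skipped</a></h3>
</div>
'''
root = ET.fromstring(html)

# Equivalent of '//h3[@class="r"]/a' in ElementTree's XPath subset
links = root.findall('.//h3[@class="r"]/a')

for a in links:
    print(a.text, a.get('href'))
# Only the two class="r" results match; the third h3 is ignored
```

Because the XPath no longer pins down a specific position in the document, every anchor nested inside an h3 with class "r" is returned, which is exactly what the Splinter version does against the live page.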

Now, it's time to sit back and let the magic happen.

To extract the title and link for each search result, all you need to do is insert the following lines of code into your Python terminal.
scraped_data = []
for search_result in search_results:
    title = search_result.text  # no .encode('utf8') needed in Python 3
    link = search_result["href"]
    scraped_data.append((title, link))
And now, all of the titles and links have been scraped and stored in the scraped_data list.

To export all of the data to CSV, you can simply use a pandas DataFrame with the following two lines of code:
df = pd.DataFrame(data=scraped_data, columns=["Title", "Link"])
df.to_csv("links.csv")
These two lines create a CSV file with the headers Title and Link, followed by all of the data that was just scraped.
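As a quick sanity check, here's a self-contained sketch of the same export step using made-up (title, link) pairs in place of real scraped results, reading the file back with the standard csv module to confirm what lands on disk:

```python
import csv
import pandas as pd

# Hypothetical scraped (title, link) pairs standing in for real SERP data
scraped_data = [("Serpstat", "https://serpstat.com/"),
                ("Example", "https://example.com/")]

df = pd.DataFrame(data=scraped_data, columns=["Title", "Link"])
df.to_csv("links.csv", index=False)  # index=False drops pandas' row-number column

# Read the file back to verify the header and first data row
with open("links.csv", newline="") as f:
    rows = list(csv.reader(f))
print(rows[0])  # → ['Title', 'Link']
print(rows[1])  # → ['Serpstat', 'https://serpstat.com/']
```

Passing index=False is optional; without it, pandas prepends an unnamed column of row numbers to the CSV, which you'd then have to ignore downstream.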

Pretty cool, huh?

And that, ladies and gentlemen, is how you scrape the Google SERPs using Python.

Final Thoughts

Website scraping is an invaluable web development skill that will allow you to take back control of your data and uncover many of the "secrets" that Google has hidden right below the surface.

By following the simple framework I outlined today, you can quickly and easily gain access to just about any information you need with a few lines of code and the click of a button (or mouse, as it were).

So go out and scrape away!

Did you find this article helpful? Do you have any tips or tricks for improving your website scraping abilities? Comments, questions, concerns? Feel free to drop us a line below and let us know!
