Scrape business information from Google Maps

Google Maps is more than a tool for finding locations; it is also a rich database of business information. Many people scrape Google Maps data to build a business directory or a base of business leads.

This tutorial will guide you on how to get business information from Google Maps.

For Google Maps scraping, you can use our ready-to-use Task Template available on the home page or follow this tutorial to build the task from scratch.

With the template, you only need to enter a keyword (e.g., Accounting, NY) or a web page URL (e.g., https://www.google.com/maps/search/insurance+West+University+Place,+TX/@29.716598,-95.4987615,10z/data=!3m1!4b1) and then wait for the results.

Here is the template data sample for your reference. To try out the template, you can apply for a 14-day premium trial to get started: Try Octoparse 14-day free premium trial!

If you want to learn how to set up the crawler on your own, you may continue with this tutorial.

We will scrape the following data fields: Title, Review number, Review rating, Address, Phone, Website, and Open time.

The main steps are shown in the menu on the right and you can download the demo task file here.


1. Go to Web Page - to open the targeted web page

  • Enter the example URL into the search bar and click Start

If you have several URLs to scrape, you can enter them all into the bar.
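
If you start from keywords rather than ready-made URLs, a list of search URLs can be prepared beforehand and pasted into the bar. The following is a minimal Python sketch, not part of Octoparse. The helper name `build_search_url` is hypothetical, and the `/maps/search/<terms>` pattern is an assumption inferred from the example URL in this tutorial rather than a documented interface.

```python
from urllib.parse import quote_plus

# Assumption: the /maps/search/<terms> pattern is inferred from the example
# URL in this tutorial; it is not a documented Google interface.
BASE = "https://www.google.com/maps/search/"


def build_search_url(keyword: str) -> str:
    """Turn a keyword such as 'Accounting, NY' into a Google Maps search URL."""
    # quote_plus turns spaces into '+'; commas are kept to match the example URL.
    return BASE + quote_plus(keyword.strip(), safe=",")


keywords = ["Accounting, NY", "insurance West University Place, TX"]
urls = [build_search_url(k) for k in keywords]

# One URL per line, ready to paste into the URL bar.
print("\n".join(urls))
```

The generated URLs omit the `@latitude,longitude,zoom` part of the example URL, so whether they open the same result list should be checked in a browser first.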


2. Create a Loop Item with Partial Scroll - to load more results

  • Add a Loop Item to the workflow

  • Select Loop Mode as Scroll Page

  • Select Scroll Area as Partial Scroll

  • Input the XPath //a[@class="hfpxzc"]/../../..

  • Select Scroll for one screen

  • Set Repeats as 100 and Wait time as 1s

  • Click Apply to save


3. Create a Loop Item - to click on each result

  • Click on the first business block on the list

  • Click on the second business block

  • Choose Loop click each URL

  • Choose No

  • Go to the settings of Loop Item1

  • Select Loop Mode as Variable List

  • Input the XPath //a[@class="hfpxzc"]

  • Click Apply to save

  • Go to settings of Click Item

  • Go to Options

  • Uncheck Open in a new tab

  • Set the AJAX timeout to 7s

  • Click Apply to save


4. Extract data - to select the data for extraction

  • On the web page, select the information you want, such as the business title, category, and address

  • Select Element data

  • Go to the settings of the Extract Data

  • Untick Extract data in the loop

  • Go to the Options

  • Set the wait time to 3s

  • Click Apply to save

Please note that Google is strict about data scraping and its page source is hard to read, so we need to revise the element XPath for each data field to ensure precise scraping.

No worries! We have prepared everything you need. You can just use the element XPath provided below.

  • Go to the data preview and switch to Vertical View

  • Replace the default XPath with the revised one

XPaths for common fields are shown below:

Title: //h1

Number of reviews: //span[contains(@aria-label,'review')]

Rating: //span[contains(@aria-label,'star')]/preceding-sibling::span

Category: //button[@jsaction="pane.rating.category"]

Address: //button[@data-item-id="address"]

Website: //a[contains(@aria-label,'Website')]

Phone number: //button[contains(@data-item-id,"phone")]

Open time: //div[contains(@aria-label,'Monday')]

Note: check out more on XPath: What is XPath and how to use it in Octoparse
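
To see how such XPaths pick out elements, here is a small Python sketch that applies three of the exact-match expressions to a made-up fragment. The sample markup and the `extract_fields` helper are illustrative assumptions, not real Google Maps source. Python's built-in `xml.etree.ElementTree` supports only a subset of XPath (no `contains()` or `preceding-sibling`), so the other fields are left out here.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified fragment standing in for a place page (assumption);
# the real markup is far larger and changes over time.
SAMPLE = """
<div>
  <h1>Example Insurance Agency</h1>
  <button jsaction="pane.rating.category">Insurance agency</button>
  <button data-item-id="address">123 Example St, Houston, TX</button>
</div>
"""

# Exact-match XPaths from the list above. ElementTree needs a leading '.'
# to search relative to the root element.
FIELD_XPATHS = {
    "Title": ".//h1",
    "Category": ".//button[@jsaction='pane.rating.category']",
    "Address": ".//button[@data-item-id='address']",
}


def extract_fields(markup: str) -> dict:
    """Return the text of the first match for each field, or None if absent."""
    root = ET.fromstring(markup.strip())
    result = {}
    for name, xpath in FIELD_XPATHS.items():
        node = root.find(xpath)
        # A missing element yields None instead of raising an error.
        result[name] = node.text.strip() if node is not None and node.text else None
    return result


print(extract_fields(SAMPLE))
```

Fields that are missing from a page come back as `None`, which mirrors how a business without a listed website or phone number leaves an empty cell in the scraped data.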


5. Extract from page-level URL - to extract GPS coordinates (optional)

As many of you have requested, this step shows how to extract GPS coordinates from Google Maps.

The coordinates are embedded in the page URL, so we first need to extract the page URL in the loop.

  • Click the Add Custom Field icon in the data preview section

  • Select Page-level data and then Page URL

Next, we need to match the coordinates in the page URL with a regular expression.

  • Click the More button of the Page URL data field and select Clean data

  • Click + Add Step and then Match with Regular Expression

Try the RegEx tool if you don't want to write regular expressions yourself.

  • Input the following parameters and tick Match all

  • Check the "Results" box to see if the data is in our desired format

  • Click "Apply" to save the settings
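
To check the idea outside Octoparse, the coordinate match can be sketched in Python. The pattern below is an assumption based on the `@latitude,longitude` segment visible in the example URL of this tutorial; it is not necessarily the exact expression used in the Clean data step, and `extract_coordinates` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: '@<latitude>,<longitude>' as seen in the example URL.
COORD_RE = re.compile(r"@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)")


def extract_coordinates(page_url: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) from a Google Maps URL, or None if absent."""
    match = COORD_RE.search(page_url)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


url = (
    "https://www.google.com/maps/search/insurance+West+University+Place,+TX/"
    "@29.716598,-95.4987615,10z/data=!3m1!4b1"
)
print(extract_coordinates(url))  # (29.716598, -95.4987615)
```

Returning `None` for URLs without an `@` segment keeps rows that lack coordinates from stopping a batch run.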

6. Start extraction - to run the task and get data

  • Click Save to save the task

  • Click Run on the upper left side

  • Select Run task on your device to run the task on your computer

Here is the sample output.

Note: Google Maps does not show email addresses. If you want to get the business email address, please check our Email & social media links template.

Just input the business website URLs you scraped from Google Maps into this template and you will get the email addresses.
