Google Maps is more than a map for finding locations; it is also a rich database of business information. Many people scrape Google Maps data to build a business directory or a lead list.
This tutorial will guide you on how to get business information from Google Maps.
For Google Maps scraping, you can use our ready-to-use Task Template available on the home page or follow this tutorial to build the task from scratch.
With the template(s), you just need to enter a keyword (e.g., Accounting, NY) or a web page URL (e.g., https://www.google.com/maps/search/insurance+West+University+Place,+TX/@29.716598,-95.4987615,10z/data=!3m1!4b1) and then wait for the data to come out.
Here is the template data sample for your reference. To try out the template, you can apply for the Octoparse 14-day free premium trial.
If you want to learn how to set up the crawler on your own, you may continue with this tutorial.
Example URL: https://www.google.com/maps/search/insurance+West+University+Place,+TX/@29.716598,-95.4987615,10z/data=!3m1!4b1
We will scrape the following data fields: title, number of reviews, rating, address, phone, website, and open time.
The main steps are shown in the menu on the right and you can download the demo task file here.
1. Go to Web Page - to open the targeted web page
Enter the example URL into the search bar and click Start
You can enter several URLs into the bar if you have many URLs to scrape.
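If you have many keywords, the search URLs can be generated rather than typed by hand. The sketch below is a minimal illustration in Python. It assumes that the path-style pattern seen in the example URL (https://www.google.com/maps/search/ followed by the keyword) is enough to open a results page, and it leaves out the coordinate and data= parts of the example. The function name build_search_url and the keyword list are hypothetical.

```python
from urllib.parse import quote_plus

# Assumed base pattern, taken from the example URL in this tutorial.
BASE = "https://www.google.com/maps/search/"

def build_search_url(keyword: str) -> str:
    """Turn a free-text keyword into a search URL (assumed pattern)."""
    # Spaces become '+'; commas are kept unencoded, as in the example URL.
    return BASE + quote_plus(keyword.strip(), safe=",")

# Hypothetical keyword list; replace with your own.
keywords = ["insurance West University Place, TX", "Accounting, NY"]
urls = [build_search_url(k) for k in keywords]

for url in urls:
    print(url)
```

The printed URLs can then be pasted into the search bar, one per line.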
2. Create a Loop Item with Partial Scroll - to load more results
Add a Loop Item to the workflow
Select Loop Mode as Scroll Page
Select Scroll Area as Partial Scroll
Input the XPath //a[@class="hfpxzc"]/../../..
Select Scroll for one screen
Set Repeats as 100 and Wait time as 1s
Click Apply to save
3. Create a Loop Item - to click on each result
Click on the first business block on the list
Click on the second business block
Choose Loop click each URL
Choose No
Go to the settings of the Loop Item1
Select Loop Mode as Variable List
Input the XPath //a[@class="hfpxzc"]
Click Apply to save
Go to settings of Click Item
Go to Options
Uncheck Open in a new tab
Set the AJAX timeout to 7s
Click Apply to save
4. Extract data - to select the data for extraction
On the web page, select the information you want, such as the business title, category, and address
Select Element data
Go to the settings of the Extract Data
Untick Extract data in the loop
Go to the Options
Set the wait time to 3s
Click Apply to save
Please note that Google is strict about data scraping and its page source is hard to read, so we need to revise the element XPath for each data field to ensure accurate scraping.
No worries! We have prepared everything you need. You can just use the element XPath provided below.
Go to the data preview and switch to Vertical View
Replace the default XPath with the revised one
XPaths for common fields are shown below:
Title: //h1
Number of reviews: //span[contains(@aria-label,'review')]
Rating: //span[contains(@aria-label,'star')]/preceding-sibling::span
Category: //button[@jsaction="pane.rating.category"]
Address: //button[@data-item-id="address"]
Website: //a[contains(@aria-label,'Website')]
Phone number: //button[contains(@data-item-id,"phone")]
Open time: //div[contains(@aria-label,'Monday')]
Note: to learn more about XPath, see What is XPath and how to use it in Octoparse
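To see how such XPaths pick out a field, it can help to try them on a small snippet outside the tool. The sketch below uses Python's standard-library ElementTree on a made-up, heavily simplified fragment; the markup and the business details in it are assumptions for illustration, not real Google Maps source. ElementTree supports only a subset of XPath, so only the Title and Address expressions are shown, written in relative form (.//h1 instead of //h1).

```python
import xml.etree.ElementTree as ET

# Made-up, simplified fragment; the real page markup is far larger.
SAMPLE = """
<div>
  <h1>Example Insurance Agency</h1>
  <button data-item-id="address">123 Example St, Houston, TX</button>
</div>
"""

# ElementTree paths are relative, so '//h1' is written './/h1'.
XPATHS = {
    "Title": ".//h1",
    "Address": ".//button[@data-item-id='address']",
}

def extract_fields(markup: str) -> dict:
    """Return the text of the first node matched by each XPath."""
    root = ET.fromstring(markup)
    row = {}
    for field, path in XPATHS.items():
        node = root.find(path)
        # A missing node or empty text is recorded as None.
        row[field] = node.text.strip() if node is not None and node.text else None
    return row

row = extract_fields(SAMPLE)
print(row)
```

The expressions that use contains() or preceding-sibling need a full XPath 1.0 engine, which the standard-library module does not provide.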
5. Extract from page-level URL - to extract GPS coordinates (optional)
As many of you have requested, this step will teach you how to extract GPS coordinates data from Google Maps.
The coordinates are hidden in the page URL. So first, we need to extract the page URL in the loop.
Click the Add Custom Field icon in the data preview section
Select Page-level data and then Page URL
Next, we need to extract the coordinates from the page URL with the RegEx tool.
Click the More button of the Page URL data field and select Clean data
Click + Add Step and then Match with Regular Expression
Try the RegEx tool if you don't want to write regular expressions yourself.
Input the following parameters and tick Match all
Check the "Results" box to see if the data is in our desired format
Click "Apply" to save the settings
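The matching step can also be sketched outside the tool. The Python example below is an assumption rather than the parameters used in this tutorial: it treats the coordinates as the latitude and longitude that follow the @ sign in the page URL, as in the example URL. The pattern COORD_RE and the function extract_coordinates are hypothetical names.

```python
import re

# Assumed pattern: '@<latitude>,<longitude>' as it appears in the example URL.
COORD_RE = re.compile(r"@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)")

def extract_coordinates(page_url: str):
    """Return (latitude, longitude) as floats, or None if nothing matches."""
    match = COORD_RE.search(page_url)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

url = ("https://www.google.com/maps/search/insurance+West+University+Place,+TX/"
       "@29.716598,-95.4987615,10z/data=!3m1!4b1")
coords = extract_coordinates(url)
print(coords)
```

Returning None when the URL has no coordinate part makes it easy to spot rows where the page URL did not contain a location.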
6. Start extraction - to run the task and get data
Click Save to save the task
Click Run on the upper left side
Select Run task on your device to run the task on your computer
Here is the sample output.