Scrape reviews from Google Maps

This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so.

As the king of navigation apps, Google Maps started out just offering an easy way to get directions from one place to another but has slowly evolved into an interactive global database overflowing with some of the most valuable business information available on the internet.

However, if you are a business owner who wants to extract reviews for businesses or places from Google Maps, you'll soon find that the official route, the Google Places API, returns at most 5 reviews per place, which is barely enough for even the simplest task. With Octoparse, you can build your own crawler and scrape as many reviews as the page loads, directly from Google Maps, within minutes.

In this tutorial, we will guide you through the steps to design your own task workflow for Google Maps reviews.

For demonstration purposes, we will scrape Google Maps reviews for Tesla's Gigafactory 1. See the sample URL below:

The main steps are shown in the menu on the right. [Download demo task file here]


1. Create a Go to Web Page - to open the target web page

Every Octoparse workflow starts with the web page that Octoparse should open first.

  • Enter the sample URL into the search bar at the top of the home screen and click Start

If you have more than one URL, check this article to see how Octoparse handles a list of URLs.


2. Create a Click Item - to go to the "All reviews" page

  • Click the "Reviews" button, which opens the review page, then select the Click button to generate a Click Item action in your workflow

  • Set AJAX timeout to 15s or longer.

Now we have reached the page that hosts reviews.


3. Create a Loop Item with Partial Scroll - to scroll down and load more reviews

The new page has multiple scroll bars, and the reviews you want sit inside a scrollable column on the left. The page won't load more reviews unless you scroll inside that column, so we need to set up a Loop Item with a partial scroll that lets the workflow scroll and extract at the same time.

  • Add a Loop Item step to your workflow

  • Click on Loop Item, set loop mode to Scroll Page, and change the scroll area from Default to Partial.

  • Enter scroll area XPath to tell Octoparse where to scroll

  • Input the XPath //div[contains(@jsaction,'pane.review.out')]/../../..

Check out this article to embark on your journey to become an XPath master.

  • Choose scroll to the bottom of the page

  • Set scroll repeats (how many times you want to scroll)

  • Set a wait time (interval time between each scroll)

  • Click Apply to save your settings
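
To see what the scroll-area XPath from the steps above selects, here is a minimal Python sketch. It assumes a simplified, hypothetical page structure (the real Google Maps markup is more deeply nested and changes over time). It uses only the standard library, whose XPath subset lacks contains(), so the attribute match and the three parent steps are done by hand. The function name scroll_area and the sample element ids are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: the real Google Maps DOM differs.
SAMPLE = """
<div id="pane">
  <div id="scroll-area">
    <div id="list">
      <div id="wrapper">
        <div id="review-1" jsaction="mouseover:pane.review.in; mouseout:pane.review.out">Great tour</div>
      </div>
    </div>
  </div>
</div>
"""

def scroll_area(root, needle="pane.review.out", levels_up=3):
    """Mimic //div[contains(@jsaction, needle)]/../../.. for the first match."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for node in root.iter("div"):
        if needle in node.get("jsaction", ""):
            # Each "/.." in the XPath climbs one ancestor.
            for _ in range(levels_up):
                node = parent_of[node]
            return node
    return None

root = ET.fromstring(SAMPLE)
target = scroll_area(root)
print(target.get("id"))  # prints "scroll-area"
```

The expression first finds a review element by its jsaction attribute, then climbs three levels to reach the container that actually scrolls. A real XPath engine would return the matching ancestor of every review; this sketch stops at the first one, which is enough to show the idea.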


Now we have successfully set up a partial scroll loop.
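
When choosing the scroll repeats and wait time, a rough estimate can help. The Python sketch below assumes each scroll loads a fixed batch of reviews. The batch size of 10 and the 2-second wait are assumptions for illustration, not documented Google Maps or Octoparse values, so check them against what you see in the built-in browser.

```python
import math

def scroll_plan(target_reviews, reviews_per_scroll=10, wait_seconds=2.0):
    """Estimate scroll repeats and total waiting time for a review target.

    reviews_per_scroll and wait_seconds are illustrative assumptions.
    """
    if target_reviews <= 0 or reviews_per_scroll <= 0:
        raise ValueError("target_reviews and reviews_per_scroll must be positive")
    # Round up so the last partial batch still gets its own scroll.
    repeats = math.ceil(target_reviews / reviews_per_scroll)
    return repeats, repeats * wait_seconds

repeats, seconds = scroll_plan(500)
print(f"{repeats} scroll repeats, about {seconds:.0f} s of waiting")
```

Under these assumptions, 500 reviews need 50 scroll repeats and roughly 100 seconds of waiting. A longer wait time makes each scroll more reliable on slow connections at the cost of a longer run.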


4. Extract Data in the Loop - to select the data for extraction

This step is quick and easy with Octoparse's innovative auto-detect function.

  • Click Auto-detect webpage data in the Tips panel

  • Wait for detection to complete, then click Create Workflow without ticking Add a page scroll, since we already set up scrolling in Step 3

Note: If auto-detect fails to find the list, you can instead select several similar elements on the web page to show Octoparse the pattern for selection. Check out this article to see how to set up a list extraction manually.

  • After creating the workflow, rename a data field by double-clicking its name

  • Remove a field you don't need by clicking More (the three-dot icon) and deleting it

In this case, we want to extract data such as the reviewer name, review date, review count, review content, and the number of likes each review gets.

  • Make sure the newly created loop item (named Loop Item 1 by default) sits inside the scroll loop item from Step 3. If it does not, drag Loop Item 1 into that Loop Item.


5. Clean the data fields - to refine the data

You may notice that some values in the review count column carry the unwanted prefix "Local Guide ·". Use Clean data to delete this text.

  • Click on the three dots for more options for data fields

  • Click on Clean data

  • Click + Add Step and select the Replace option

  • Input "Local Guide · " in the Replace box and leave the "With" box blank, so the text is replaced with nothing

  • Click Evaluate to see if we have the desired result

  • Click Confirm to apply the change
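
For reference, the Replace step above amounts to a plain string replacement. This Python sketch shows the same transformation outside Octoparse, for example if you prefer to clean an exported file afterwards. The function name and the sample values are made up for illustration.

```python
PREFIX = "Local Guide · "  # the text entered in the Replace box

def clean_review_count(raw):
    """Remove the "Local Guide · " prefix, as the Replace step does."""
    # Replacing with an empty string mirrors leaving the "With" box blank.
    return raw.replace(PREFIX, "")

samples = ["Local Guide · 52 reviews", "3 reviews"]  # made-up values
for raw in samples:
    print(clean_review_count(raw))
```

Values without the prefix pass through unchanged, so the step is safe to apply to the whole column.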


6. Run the task - to get your target data

  • Click Save on the upper right to save your task

  • Click Run next to it and wait for a Run Task window to pop up

  • Select Standard Mode in the Run on your device column to run the task on your local device

  • Maximize the scraping window

  • Wait for the task to complete

Here is a sample output from a local run.
