You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!
Example URL (Tesla Gigafactory):
https://www.google.com/maps/place/Tesla+Gigafactory/@39.5375591,-119.4412284,17z/data=!3m1!4b1!4m5!3m4!1s0x80991fc240ba30b9:0x7e66b0fa4fe55cd8!8m2!3d39.537555!4d-119.4390397?hl=en
What We'll Extract:
✔ Reviewer names
✔ Review dates
✔ Review content
✔ Star ratings
✔ Like counts
✔ Local Guide status
[Download demo task file here]
1. Create the task
If you have more than one URL, check this article to see how Octoparse handles a list of URLs.
2. Create a Click Item - to go to the "All reviews" page
Click the "Reviews" button on the page, then select Click to add a Click Item action to your workflow. This action takes you to the review page.
Set the AJAX timeout to 15 seconds or longer.
Now we have reached the page that hosts reviews.
3. Configure Smart Scrolling
Add Loop Item with these settings:
Mode: Scroll Page
Scroll Area: Partial
XPath: //div[contains(@jsaction,'pane.review.out')]/../../..
Scroll: to the bottom of the page
Repeats: 10-20 (adjust as needed)
Wait Time: 3-5 seconds between scrolls
Click Apply to save your settings
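To make the Repeats and Wait Time settings concrete, here is a conceptual sketch in Python. It is not how Octoparse is implemented. The names `scroll_reviews`, `make_fake_loader`, and `load_more` are hypothetical, and stopping early on an empty batch is an assumption of this sketch only.

```python
import time
from typing import Callable, List


def scroll_reviews(load_more: Callable[[], List[str]],
                   repeats: int = 10,
                   wait_seconds: float = 3.0,
                   sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Scroll up to `repeats` times, pausing `wait_seconds` after each scroll."""
    reviews: List[str] = []
    for _ in range(repeats):
        batch = load_more()          # one scroll of the review pane
        if not batch:                # assumption: nothing new means we are done
            break
        reviews.extend(batch)
        sleep(wait_seconds)          # give the page time to load more reviews
    return reviews


def make_fake_loader(total: int, per_scroll: int) -> Callable[[], List[str]]:
    """Hypothetical stand-in for a page that reveals `per_scroll` reviews per scroll."""
    state = {"loaded": 0}

    def load_more() -> List[str]:
        start = state["loaded"]
        end = min(start + per_scroll, total)
        state["loaded"] = end
        return [f"review {i + 1}" for i in range(start, end)]

    return load_more


if __name__ == "__main__":
    loader = make_fake_loader(total=25, per_scroll=10)
    loaded = scroll_reviews(loader, repeats=2, wait_seconds=0, sleep=lambda s: None)
    print(len(loaded), "reviews loaded with 2 repeats")
```

In this sketch, a Repeats value that is too low cuts the list short, which is why the setting above is marked "adjust as needed".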
4. Extract Review Data
Option A: Auto-Detection (Recommended)
Click "Auto-detect webpage data"
Disable the page scroll option (scrolling is already configured in step 3), then click Create workflow
Rename/delete fields as needed
Option B: Manual Selection
Select multiple review elements
Choose fields:
Reviewer: //div[@class='d4r55']
Stars: //span[@aria-hidden='true']
Date: //span[@class='rsqaWe']
Content: //span[@class='wiI7pd']
Note: If auto-detection fails to find the review list, you can select several similar elements on the web page yourself so that Octoparse learns the selection pattern. Check out this article to see how to set up a list extraction manually.
Make sure the loop item you create (named Loop Item 1 by default) sits inside the scrolling loop item created earlier. If it does not, drag Loop Item 1 into that loop item.
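The sketch below shows how the Option B XPaths behave when applied once per review, which is roughly what a nested loop item does. It runs outside Octoparse, using Python's standard library. The HTML fragment and the `review` container class are made up for illustration; only the class names `d4r55`, `rsqaWe`, and `wiI7pd` come from this guide. The Stars field is left out because its sample markup would be a guess.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical, well-formed fragment standing in for the review pane.
SAMPLE = """
<div>
  <div class="review">
    <div class="d4r55">Alice</div>
    <span class="rsqaWe">2 months ago</span>
    <span class="wiI7pd">Great tour.</span>
  </div>
  <div class="review">
    <div class="d4r55">Bob</div>
    <span class="rsqaWe">a year ago</span>
    <span class="wiI7pd">Huge building.</span>
  </div>
</div>
"""

# Field XPaths from this guide, written relative to one review element.
FIELDS = {
    "reviewer": ".//div[@class='d4r55']",
    "date": ".//span[@class='rsqaWe']",
    "content": ".//span[@class='wiI7pd']",
}


def extract_reviews(markup: str) -> List[Dict[str, str]]:
    """Return one dict per review, with an empty string for any missing field."""
    root = ET.fromstring(markup)
    rows = []
    for review in root.findall(".//div[@class='review']"):
        rows.append({name: (review.findtext(path) or "").strip()
                     for name, path in FIELDS.items()})
    return rows


if __name__ == "__main__":
    for row in extract_reviews(SAMPLE):
        print(row)
```

Evaluating each field relative to its own review element keeps a reviewer's name, date, and content together in the same row.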
5. Data Cleaning
For Local Guide text removal:
Evaluate to see if we have the desired result
Confirm to apply the change
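The exact cleaning rule is not shown in this guide. As an illustration of the idea only, here is a minimal Python sketch that assumes the scraped text looks like "Local Guide · 45 reviews". That sample format, the regular expression, and the function names `strip_local_guide` and `is_local_guide` are assumptions, not Octoparse settings.

```python
import re

# Assumed pattern: the label "Local Guide", optionally followed by a "·" separator.
LOCAL_GUIDE = re.compile(r"\s*Local Guide\s*(?:·\s*)?")


def is_local_guide(text: str) -> bool:
    """True when the scraped text carries the Local Guide label."""
    return "Local Guide" in text


def strip_local_guide(text: str) -> str:
    """Remove the Local Guide label and tidy the surrounding whitespace."""
    return LOCAL_GUIDE.sub(" ", text).strip()


if __name__ == "__main__":
    raw = "Local Guide · 45 reviews"
    print(is_local_guide(raw), strip_local_guide(raw))
```

Keeping the boolean check separate from the removal lets the Local Guide status remain available as its own field after the text is cleaned.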
6. Run & Export
Click Save on the upper right to save your task
Click Run next to it and wait for a Run Task window to pop up
Select Standard Mode in the Run on your device column to run the task on your local device
Maximize the scraping window
Wait for the task to complete
Here is a sample output from a local run.