Scrape product reviews from Amazon
Updated over a week ago

You are browsing a tutorial guide for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!

Product reviews are a good resource for improving your product performance. In this tutorial, we will show you how to scrape product reviews from Amazon.com.

For Amazon product scraping, you can either use our ready-to-use Template, available on the home page, or follow this tutorial to build the task from scratch.


The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Create a Go to Web Page - to open the target web page

  • Paste the URL and click Start

You will see a Go to Web Page step created in the workflow.

  • Open the settings of the Go to Web Page step -> Options

  • Tick Use cookie

  • Click Use cookie from the current page

  • Click Apply to save


2. Create a Click Item - to see all reviews

  • Scroll down the page to find the See all reviews button

  • Click on it and choose Click URL


3. Auto-detect web page data - to create the workflow

  • Select Auto-detect web page data

  • Wait for the detection to complete -> uncheck Add a page scroll -> click Create workflow


4. Adjust AJAX timeout for Pagination

  • Click on the Click to Paginate step and set the AJAX Timeout to 10s

  • Click Apply
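
To illustrate what a pagination timeout means in general, here is a minimal Python sketch of "wait until the next page has loaded, but no longer than 10 seconds". It is a conceptual illustration only, not Octoparse's implementation: the function `wait_for_new_content`, the `has_loaded` callback, and the polling interval are all assumptions made for this example.

```python
import time
from typing import Callable


def wait_for_new_content(
    has_loaded: Callable[[], bool],
    timeout_s: float = 10.0,
    poll_interval_s: float = 0.5,
) -> bool:
    """Poll has_loaded() until it returns True or timeout_s seconds have passed.

    Returns True if the content loaded in time, False if the timeout expired.
    has_loaded is a hypothetical callback, e.g. "did the review list change?".
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if has_loaded():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

A timeout that is too short can cut the wait off before slow-loading reviews appear, while a longer one only costs extra time on pages that never finish loading.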


5. Check the data and workflow

  • Go to Data Preview to check the current data output. Double-click on the header to rename it, or click "..." to delete a data field
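
If you prefer to tidy the fields after exporting instead, the same renaming and deleting can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names `Field1`, `Field2`, and `Field3`, the sample row, and the helper `tidy_fields` are hypothetical placeholders and do not reflect the actual field names Octoparse produces.

```python
import csv
import io


def tidy_fields(rows, rename, drop):
    """Return rows with the `drop` columns removed and the `rename` mapping applied."""
    dropped = set(drop)
    return [
        {rename.get(name, name): value for name, value in row.items() if name not in dropped}
        for row in rows
    ]


# Hypothetical export: these column names and values are placeholders,
# not real Octoparse output.
exported = io.StringIO(
    "Field1,Field2,Field3\n"
    "Great product,5.0 out of 5 stars,https://example.com/r/1\n"
)
rows = list(csv.DictReader(exported))
cleaned = tidy_fields(
    rows,
    rename={"Field1": "review_text", "Field2": "rating"},
    drop=["Field3"],
)
print(cleaned)
```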

Here's what the final workflow looks like. Once everything is in place, you can go on to run the task.

[Screenshot: the final workflow]

6. Run task to extract data

  • Click Run in the top-right corner

  • Click Run on your device to run the task on your local device, or select Run in the Cloud to run the task in the Cloud (for premium users only)

Here is the sample output:
