Scrape product details from Amazon
Updated over a week ago

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so.

Amazon is one of the most popular e-commerce websites around the world. Many users try to scrape it to collect product information. In this tutorial, we are going to show you how to scrape product details from Amazon.

You can also go to "Task Templates" on the main screen of the Octoparse scraping tool and start directly with the ready-to-use Amazon templates to save time. Octoparse provides several Amazon templates designed for different countries, such as Germany, France, the US, Spain, and India. With these templates, there is no need to configure the scraping task yourself. For further details, see: Templates

If you would like to know how to build the task from scratch, continue reading the tutorial below or watch the video below.

To follow through, you may want to use this URL in the tutorial:

The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Go to Web Page - to open the targeted web page

  • Enter the URL on the home page and click Start


2. Auto-detect the web page - to create the workflow

  • Click Auto-detect web page data and wait for the detection to complete

  • Uncheck Add a page scroll

  • Click Create workflow

A Pagination loop and a Loop Item are generated automatically in the workflow.

  • Click more and Delete field to get rid of the unwanted data

  • Double-click to rename data fields

If all the data you need can be scraped from the listing page, you can stop here and jump to step 5, Set up AJAX timeout for "Click to Paginate". If you want to open each product detail page to get more information, follow the steps below.
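The workflow created in step 2 can be pictured as two nested loops: the outer one walks through the result pages and the inner one walks through the items on each page. The sketch below only illustrates that structure. The `PAGES` data and the `scrape_listing` function are hypothetical stand-ins, not part of Octoparse, and no real web page is fetched.

```python
from typing import Dict, Iterator, List

# Hypothetical in-memory stand-in for two pages of search results.
PAGES: List[List[Dict[str, str]]] = [
    [{"title": "Item A", "price": "$10"}, {"title": "Item B", "price": "$12"}],
    [{"title": "Item C", "price": "$8"}],
]


def scrape_listing(pages: List[List[Dict[str, str]]]) -> Iterator[Dict[str, object]]:
    """Yield one row per product, page by page."""
    for page_number, items in enumerate(pages, start=1):  # Pagination loop
        for item in items:                                # Loop Item
            yield {"page": page_number, **item}           # Extract Data
        # Moving to the next iteration stands in for "Click to Paginate".


rows = list(scrape_listing(PAGES))
for row in rows:
    print(row)
```

Each yielded row corresponds to one line of the exported data, which is why renaming or deleting a field in the preview affects every row at once.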


3. Click on each product link - to scrape more information

  • Click the second item on the page and choose Click element on the Tips panel

This is what the workflow should look like:

  • Select the Click Item step and paste the new XPath: //a[@class="a-link-normal s-no-outline"]

  • Click Apply
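To see what the XPath above selects, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The `SAMPLE` markup is a made-up, simplified fragment (Amazon's real pages are far larger and change over time), and the `href` values are placeholders. ElementTree needs the relative form `.//a[...]`, whereas Octoparse accepts `//a[...]` as written in the step.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; not copied from a real Amazon page.
SAMPLE = """
<div>
  <div class="result">
    <a class="a-link-normal s-no-outline" href="/dp/EXAMPLE1"><img alt="one"/></a>
    <a class="a-link-normal" href="/help">Help</a>
  </div>
  <div class="result">
    <a class="a-link-normal s-no-outline" href="/dp/EXAMPLE2"><img alt="two"/></a>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)

# Match only <a> elements whose class attribute equals the full string.
links = root.findall('.//a[@class="a-link-normal s-no-outline"]')
hrefs = [a.get("href") for a in links]
print(hrefs)
```

The predicate compares the whole `class` attribute, so a link with only `a-link-normal` is skipped. If the site changes the class names on its product links, the XPath will need to be updated accordingly.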


4. Extract Data - to extract data from the detail pages

  • Select information on the web page

  • Choose Text

  • Repeat the above steps to extract all the data you need


5. Set up AJAX timeout for "Click to Paginate"

  • Open the settings of Click to Paginate

  • Go to Options

  • Tick Load with AJAX and select 10s as the AJAX timeout
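The idea behind an AJAX timeout is a bounded wait: after the click, the page updates without a full reload, so the tool needs an upper limit on how long to wait before moving on. The sketch below shows that general pattern only. `wait_for` is a hypothetical helper, not an Octoparse function, and it makes no claim about how Octoparse implements the setting internally.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool],
             timeout: float = 10.0,
             interval: float = 0.1) -> bool:
    """Poll `condition` until it is true or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One last check once the deadline is reached.
    return condition()


# Simulate content that becomes available 0.3 seconds after the "click".
start = time.monotonic()
loaded = wait_for(lambda: time.monotonic() - start > 0.3, timeout=2.0)

# Simulate content that never arrives: the wait gives up after the timeout.
timed_out = wait_for(lambda: False, timeout=0.2)

print(loaded, timed_out)
```

A longer timeout is safer on slow connections but makes every pagination click take longer when the content never changes, which is the trade-off behind picking a value such as 10s.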


6. Run extraction - run your task and get data

  • Click Save

  • Click Run on the upper left side

  • Select Run on your device to run the task on your computer, or select Run task in the Cloud to run the task in the Cloud (for premium users only)

Here is the sample output.
