You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!
eBay is a multinational e-commerce company based in the U.S. that facilitates consumer-to-consumer and business-to-consumer sales through its website. It is one of the best-known and most widely used e-commerce platforms worldwide.
This tutorial will show you how to scrape image URLs from eBay product detail pages.
To follow along, you may want to use the URL below:
The main steps are shown in the menu on the right, and you can download the sample task file here.
1. Create a Go to Web Page - to open the target website
Enter the target URL on the homepage of Octoparse and click Start
2. Create a Pagination Loop - to scrape data from multiple listing pages
Scroll to the bottom of the webpage
Click the Next page button (->)
Click Loop click on the Tips panel
Set the AJAX timeout (7-10s recommended)
Note: If you want to learn more about AJAX and how Octoparse handles it, please check it out here.
3. Create a Page Scroll down - to load the data on each page fully
Click the Add Step button (+) in the workflow > Loop
Set the Loop Mode as Scroll Page
Tick Scroll for one screen
Set the number of repeats to 15
Click Apply
4. Create a Loop Item - to click each product link in turn and open its detail page
Click on the name of the first product
Click Select all similar elements on the Tips panel
Choose Loop click each element
Click No
To make the loop more accurate, we need to modify the XPath of the Loop Item.
Click Loop Item 1
Set the Loop Mode as Variable List
Input the matching XPath as: //ul[@class="b-list__items_nofooter srp-results srp-grid"]/li//a[@class="s-item__link"]
Click Apply to save the change
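To see what the modified XPath selects, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree, which supports the subset of XPath used in this step. The sample markup and the example.com links are hypothetical: they only mirror the class names from the XPath above. Real eBay pages are not well-formed XML, so outside Octoparse you would need a proper HTML parser rather than this simplified snippet.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified listing markup (not real eBay HTML).
SAMPLE = """
<html><body>
  <ul class="b-list__items_nofooter srp-results srp-grid">
    <li><div><a class="s-item__link" href="https://example.com/itm/1">Item 1</a></div></li>
    <li><div><a class="s-item__link" href="https://example.com/itm/2">Item 2</a></div></li>
    <li><div><a class="other-link" href="https://example.com/help">Help</a></div></li>
  </ul>
  <ul class="sidebar">
    <li><a class="s-item__link" href="https://example.com/itm/ad">Sponsored</a></li>
  </ul>
</body></html>
"""

# Same expression as in the tutorial, with a leading "." because
# ElementTree evaluates paths relative to the element it is called on.
LOOP_XPATH = (
    './/ul[@class="b-list__items_nofooter srp-results srp-grid"]'
    '/li//a[@class="s-item__link"]'
)


def product_links(markup):
    """Return the href of every link matched by the loop XPath."""
    root = ET.fromstring(markup)
    return [a.get("href") for a in root.findall(LOOP_XPATH)]


links = product_links(SAMPLE)
print(links)
```

In this sample, only the two product links inside the results list match. The link with a different class and the link in the other list are skipped, which is the kind of narrowing the modified XPath is meant to give the Loop Item.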
5. Extract data - to extract the image URLs
Click on the first image on the sidebar
Click Select all similar elements on the Tips panel
Click Text
Click Loop Item 2
Set the Loop Mode as Variable List
Input the matching XPath as: //div[@class='ux-image-filmstrip-carousel']/button/img
Click Apply to save the change
Click the More button next to the data field
Choose Customize field
Choose Select image URL(src attribute)
Choose Merge multiple rows of data into one
Note: The Merge multiple rows of data into one option puts all the image URLs of one product into a single cell. If you want each image URL in its own row, leave it unticked. To scrape images into separate columns, you may refer to Capture images from a carousel.
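The extraction in this step can be sketched outside Octoparse as follows. This is an illustration only, not how Octoparse works internally: the sample markup and example.com URLs are hypothetical, and the newline separator used for the merged cell is an assumption, since the separator Octoparse uses is not stated here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified detail-page markup (not real eBay HTML).
SAMPLE = """
<html><body>
  <div class="ux-image-filmstrip-carousel">
    <button><img src="https://example.com/img/1.jpg" alt="1" /></button>
    <button><img src="https://example.com/img/2.jpg" alt="2" /></button>
    <button><img src="https://example.com/img/3.jpg" alt="3" /></button>
  </div>
  <div class="other"><button><img src="https://example.com/img/logo.png" /></button></div>
</body></html>
"""

# Same expression as in the tutorial, made relative with a leading ".".
IMAGE_XPATH = ".//div[@class='ux-image-filmstrip-carousel']/button/img"


def image_urls(markup):
    """Return the src attribute of each thumbnail, one entry per row."""
    root = ET.fromstring(markup)
    return [img.get("src") for img in root.findall(IMAGE_XPATH) if img.get("src")]


def merge_rows(values, separator="\n"):
    """Join all rows into one cell; the separator is an assumption."""
    return separator.join(values)


rows = image_urls(SAMPLE)    # one URL per row (option unticked)
merged = merge_rows(rows)    # all URLs in one cell (option ticked)
print(rows)
print(merged)
```

Here `rows` corresponds to leaving the merge option unticked, and `merged` to ticking it, so that one product yields a single cell containing all of its image URLs.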
6. Run the task - to get your target data
Click Save on the upper right to save your task
Click Run next to it and wait for a Run Task window to pop up
Select Run on your device to run the task on your local device
Wait for the task to complete
Here is the sample output from a local run: