You are browsing a tutorial for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!
AliExpress is an online retail service based in China and owned by the Alibaba Group. In this tutorial, we are going to show you how to scrape seller info from AliExpress.
If you would like to know how to build the task from scratch, continue reading the tutorial below or watch the video.
To demonstrate, we will use this URL as an example: https://www.aliexpress.com/premium/iphone-case.html?d=y&origin=y&catId=0&initiative_id=SB_20211008223741&SearchText=iphone%20case
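If you want to check what the example URL is actually searching for, its query string can be decoded with a few lines of Python. This is only a minimal sketch using the standard library; the helper name query_params is made up for illustration, and what each parameter means to AliExpress (other than SearchText carrying the keyword) is not covered by this tutorial.

```python
from urllib.parse import urlsplit, parse_qs

# The example URL from this tutorial, split across lines for readability.
EXAMPLE_URL = (
    "https://www.aliexpress.com/premium/iphone-case.html"
    "?d=y&origin=y&catId=0&initiative_id=SB_20211008223741"
    "&SearchText=iphone%20case"
)


def query_params(url: str) -> dict:
    """Return the query parameters of url as a dict of decoded strings.

    Hypothetical helper for this sketch; if a parameter repeats,
    only its first value is kept.
    """
    parsed = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in parsed.items()}


params = query_params(EXAMPLE_URL)
for name, value in params.items():
    print(f"{name} = {value}")
```

The SearchText value is percent-encoded in the URL (%20 stands for a space), so it decodes to the keyword typed into the search box.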
The main steps are shown in the menu on the right, and you can download the sample task file here.
1. Create a Go to Web Page - to open the target website
Paste the URL on the home page and click Start
AliExpress requires a login to see the product information, so you need to log in to your AliExpress account first.
Switch to Browse mode
Fill in your login info and click sign in.
After logging into your account, turn off the Browse mode.
Click Go to Web Page -> Options
Tick Use Cookie and click Use cookie from the current page
Tick the Scroll down the page after it is loaded box, and change the Repeat number to 10.
Click on Apply to save the changes
2. Set up a Pagination Loop - to scrape data from multiple pages
Click the Next button on the page, then select Loop click next page
Click on Click to Paginate -> Options, update AJAX timeout to 15s
Tick the Scroll down the page after it is loaded box then adjust the number of Repeats to 10
Click Apply to save the changes
3. Set up a Loop Item - to loop click on each product link and enter the detail page
Click any two store names on the page, then select Loop click each URL
4. Create Hover On - to show the details of sellers
Click on the store name then select Hover on the selected element
5. Extract Data - to get the information needed
Extract all the info you need by clicking on it, then select Text
Repeat until you have all the info needed. In this case, we will extract the store name, store number, open date, Item as Described, Communication, and Shipping Speed.
Here we need to change the XPaths of the store number and the store's open date.
Switch to the Vertical View
Change the XPath under Field Settings
Store number: //span[contains(.,'Store No.')]
Open date: //span[.='This store has been open since ']/following-sibling::span
Double-click on the field names to rename
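To see what the two XPaths select, here is a rough sketch in Python. The standard library's ElementTree does not support contains() or following-sibling, so the sketch emulates both expressions by hand instead of evaluating them. The HTML snippet, the store name, the store number, and the date are all made up for illustration; the real AliExpress markup may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup that mimics the seller pop-up; not real AliExpress HTML.
SAMPLE = """
<div>
  <a>Example Case Store</a>
  <span>Store No.1234567</span>
  <div>
    <span>This store has been open since </span><span>Jan 1, 2018</span>
  </div>
</div>
"""


def string_value(element):
    """Text of an element and all its descendants, like '.' in XPath."""
    return "".join(element.itertext())


def spans_containing(root, needle):
    """Emulates //span[contains(., needle)]."""
    return [el for el in root.iter("span") if needle in string_value(el)]


def following_sibling_spans(root, exact_text):
    """Emulates //span[.=exact_text]/following-sibling::span."""
    results = []
    for parent in root.iter():
        children = list(parent)
        for index, child in enumerate(children):
            if child.tag == "span" and string_value(child) == exact_text:
                for sibling in children[index + 1:]:
                    # Keep each matching sibling once, as XPath would.
                    if sibling.tag == "span" and sibling not in results:
                        results.append(sibling)
    return results


root = ET.fromstring(SAMPLE.strip())
store_numbers = [string_value(el) for el in spans_containing(root, "Store No.")]
open_dates = [
    string_value(el)
    for el in following_sibling_spans(root, "This store has been open since ")
]
print(store_numbers)
print(open_dates)
```

The first expression matches on the label text rather than on position, and the second finds the label span and takes the span right after it. The exact-match test includes the trailing space after "since", so the label has to be copied character for character.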
6. Run the task - to get your target data
Below is what the final workflow looks like. If everything is in place, you can go ahead and run the task.
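As a quick pre-run check, the settings chosen in steps 1 to 5 can be written down as a plain checklist. The sketch below is only a note-taking aid in Python: the dictionary keys are hypothetical labels, not Octoparse field names, and the flat list does not model how the actions are nested in the workflow designer.

```python
# Settings quoted from steps 1 to 5 of this tutorial.
# Keys are hypothetical labels for this sketch, not Octoparse option names.
STEPS = [
    ("Go to Web Page", {"use_cookie": True, "scroll_down": True, "scroll_repeats": 10}),
    ("Pagination Loop", {"ajax_timeout_s": 15, "scroll_down": True, "scroll_repeats": 10}),
    ("Loop Item", {"selection": "Loop click each URL"}),
    ("Hover On", {"target": "store name"}),
    ("Extract Data", {"extract_as": "Text"}),
]


def format_checklist(steps):
    """Render the steps as a numbered, indented plain-text checklist."""
    lines = []
    for number, (name, settings) in enumerate(steps, start=1):
        lines.append(f"{number}. {name}")
        for key, value in settings.items():
            lines.append(f"   - {key}: {value}")
    return "\n".join(lines)


checklist = format_checklist(STEPS)
print(checklist)
```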
Click Run in the top-right corner, then select Run task on your device to run the task on your local device, or Run task in the cloud to run it on the Cloud (premium users only).
Here is the sample output: