This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust.
Bukalapak is an Indonesian e-commerce company. It helps small and medium enterprises go online, and it also supports traditional family-owned businesses.
To save time, you can go to "Template Gallery" on the sidebar of the Octoparse home page and start directly with the ready-to-use Bukalapak templates. With a template, there is no need to configure a scraping task yourself. For further details, see: Task Templates
This tutorial will show you how to collect product details on bukalapak.com with Octoparse.
To follow through, you may want to use this URL in the tutorial:
The main steps are shown in the menu on the right, and you can download the sample task file here.
1. Create a Go to Web Page step - to open the target website
Enter the page URL on the home screen and click Start to create a new task
Click on Go to Web Page > Options
Tick Scroll down the page after it is loaded
Set Scroll to "for one screen" and Repeat times to 12
Click Apply to save
2. Auto-detect the webpage - to create a workflow
Click Auto-detect web page data and wait for the detection to complete
Check the data fields in Data Preview and delete unwanted fields
Uncheck Add a page scroll
Click Create workflow
Rename the data fields after creating the workflow if needed
3. Modify the XPath of Loop Item - to locate the data field(s) more accurately
Choose Loop Item in the workflow
Input the Matching XPath as: //div[@class="bl-flex-container flex-wrap is-gutter-16"]/div
Click Apply to save
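As a rough illustration of what this Matching XPath selects, the Python sketch below runs the same expression against a small, made-up HTML snippet with the standard library's ElementTree. The sample markup and product names are assumptions for demonstration only and do not reproduce the real bukalapak.com page. The leading "." is added only because ElementTree evaluates paths relative to the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: one product grid and one unrelated container.
SAMPLE = """
<html><body>
  <div class="bl-flex-container flex-wrap is-gutter-16">
    <div><p>Product A</p></div>
    <div><p>Product B</p></div>
    <div><p>Product C</p></div>
  </div>
  <div class="bl-flex-container">
    <div><p>Not a product card</p></div>
  </div>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)

# Same expression as the Matching XPath, prefixed with "." for ElementTree.
XPATH = './/div[@class="bl-flex-container flex-wrap is-gutter-16"]/div'

# Each matched <div> corresponds to one item the loop would visit.
items = root.findall(XPATH)
names = [item.findtext("p") for item in items]
print(names)
```

In this sketch only the three direct children of the first container are matched. The predicate compares the whole class attribute string, so the second container, whose class list differs, is skipped. If the site changes its class names, the XPath would need to be updated accordingly.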
4. Modify the settings of Pagination - to fully load the content on the webpage
Click Click to paginate in the workflow > Click Options
Tick Scroll down the page after it is loaded
Set Scroll to "to the bottom of the page"
Set the scroll times to 12
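To see why the number of scrolls matters on a page that loads content lazily, here is a toy Python model. It assumes, purely for illustration, that the page renders one batch of items on load and one more batch per scroll; the function name and all numbers are hypothetical and are not measurements of bukalapak.com.

```python
def visible_items(total_items: int, batch_size: int, scrolls: int) -> int:
    """Return how many items are rendered after the initial load plus `scrolls` scrolls.

    Toy model: the first batch appears on load, and each scroll reveals one more
    batch, capped at the total number of items on the page.
    """
    return min(total_items, batch_size * (1 + scrolls))


# Hypothetical page: 50 items in total, revealed 5 at a time.
TOTAL_ITEMS = 50
BATCH_SIZE = 5

for scrolls in (0, 6, 12):
    count = visible_items(TOTAL_ITEMS, BATCH_SIZE, scrolls)
    print(f"{scrolls} scrolls: {count} items rendered")
```

Under these assumptions, too few scrolls leave part of the page unrendered, so those items would be missing from the extracted data. If your output looks incomplete, increasing the scroll times is a reasonable first thing to try.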
5. Run the task - to get your target data
Click Save on the upper right to save your task
Click Run next to it and wait for a Run Task window to pop up
Select Run on your device to run the task on your local device
Wait for the task to complete
Here is a sample output from a local run: