Kijiji is a Canadian online classified advertising website and part of the eBay Classifieds Group.
This tutorial will show you how to scrape car information from Kijiji.
To follow along, you may want to use this URL in the tutorial:
The main steps are shown in the menu on the right, and you can download the sample task file here.
1. Go to Web Page - to open the target web page
2. Create a "Loop Click Item" - to click into each item on each list page
Click on the first item card
Click on the second item card
Click on "Loop click each URL" on the Tips panel
Click Yes to create pagination
Select Next page button
Scroll down the page and click on the next page button
Click Confirm
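The sub-steps above build two nested loops: an outer loop that follows the next page button and an inner loop that clicks into each item. A minimal Python sketch of that control flow is shown here. It does not call Octoparse or Kijiji; the `PAGES` data and the `crawl` function are hypothetical stand-ins used only to illustrate the loop structure.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a paginated site: each page lists its items and
# names the next page (None on the last page). Not real Kijiji data.
PAGES: Dict[str, dict] = {
    "page-1": {"items": ["car-a", "car-b"], "next": "page-2"},
    "page-2": {"items": ["car-c"], "next": None},
}


def crawl(start: str) -> List[str]:
    """Visit every item on every page, following 'next' until it runs out."""
    visited: List[str] = []
    page: Optional[str] = start
    while page is not None:                # outer loop: pagination
        for item in PAGES[page]["items"]:  # inner loop: click each item
            visited.append(item)           # detail-page extraction would go here
        page = PAGES[page]["next"]         # "click" the next page button
    return visited


print(crawl("page-1"))
```

The outer loop ends when no next page exists, which corresponds to the task finishing once the next page button can no longer be found.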
3. Modify XPath for Loop Item - to locate all the items
After the Loop is set up for the item cards, some items are not included in it. We need to modify the XPath manually so that it locates all the items.
Click Loop Item
Set Loop Mode to Variable List
Enter the XPath //a[@data-testid="listing-link"]
Click Apply
Click on the Click Item step and Octoparse will open the car detail page
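To see why an attribute-based XPath catches items that a position-based one can miss, here is a small sketch using Python's standard `xml.etree.ElementTree`. The markup in `SAMPLE` is a hypothetical, simplified stand-in for the listing page (the real Kijiji markup is larger and may change), and `listing_links` is an illustrative helper, not part of Octoparse.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified listing markup. Note that the second link is
# nested one level deeper than the first, and the third is not a listing.
SAMPLE = """
<ul>
  <li><a data-testid="listing-link" href="/v-cars/1">Car one</a></li>
  <li><div><a data-testid="listing-link" href="/v-cars/2">Car two</a></div></li>
  <li><a data-testid="promo-link" href="/promo">Promotion</a></li>
</ul>
"""


def listing_links(markup: str) -> list:
    """Return the href of every <a> whose data-testid is 'listing-link'."""
    root = ET.fromstring(markup.strip())
    # ElementTree needs a leading "." to search the whole tree.
    matches = root.findall(".//a[@data-testid='listing-link']")
    return [a.get("href") for a in matches]


print(listing_links(SAMPLE))
```

Because the expression matches on the `data-testid` attribute rather than on the element's position, both listing links are selected regardless of nesting depth, and the promotional link is skipped.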
4. Set up Click Item - to show detailed info
The full description is collapsed on the detail page, so we need to click the "Show more" button to load the information fully.
5. Extract Data - to select the data you want
Repeat the process until you get all the information you want
Double-click the data field if you need to rename it
6. Modify XPath for data fields - to locate elements accurately on each detail page
If some data is missing or a value lands in the wrong field, we need to rewrite the XPath so that each element is located correctly on every detail page.
Go to the Data Preview panel
Switch to Vertical View by clicking the icon in the upper-right corner
Enter the XPath for each field
The XPath for each data field is listed below:
Product name: //h1[@itemprop="name"]
Price: //span[@itemprop="price"]
IMG_URL: //div[contains(@class,'backgroundImage')]//img
Address: //a[contains(@class,"location")]
Transmission: //span[contains(text(),"Transmission")]/following-sibling::span
Fuel Type: //span[contains(text(),"Fuel Type")]/following-sibling::span
Stock: //span[contains(text(),"Stock")]/following-sibling::span
Drivetrain: //span[contains(text(),"Drivetrain")]/following-sibling::span
Body Type: //span[contains(text(),"Body Type")]/following-sibling::span
Description: //div[@itemprop="description"]
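Several of the expressions above share one pattern: find a `span` whose text contains a label, then take the `span` that follows it. The sketch below mirrors that logic in plain Python, since the standard library's `ElementTree` does not support `contains()` or `following-sibling`. The markup and the values in `SAMPLE` are hypothetical, and `value_after_label` is an illustrative helper that returns only the first following `span`.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute list; the labels come from the tutorial, the
# values are made up for illustration.
SAMPLE = """
<div>
  <ul>
    <li><span>Transmission</span><span>Automatic</span></li>
    <li><span>Fuel Type</span><span>Gasoline</span></li>
  </ul>
</div>
"""


def value_after_label(root, label):
    """Return the text of the first <span> that follows a <span> containing label."""
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag == "span" and label in (child.text or ""):
                # Scan the later siblings for the first <span>.
                for sibling in children[i + 1:]:
                    if sibling.tag == "span":
                        return (sibling.text or "").strip()
    return None  # label not present on this page


root = ET.fromstring(SAMPLE.strip())
print(value_after_label(root, "Transmission"))
print(value_after_label(root, "Drivetrain"))
```

Anchoring on the label text keeps each value in the right field even when a detail page omits some attributes or lists them in a different order; a missing label simply yields no value.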
Tip: To learn more about writing XPath, please refer to this tutorial:
The final workflow should look like this:
7. Run the task - to get the desired Data
Click Run to run your task either on your device or in the cloud
Select Standard Mode under the "Run on your device" section to run the task on your local device
Wait for the task to complete
Here is the sample output data, which can be exported in Excel, CSV, HTML and JSON formats.
Tip: Local runs are well suited to quick runs and small amounts of data. For more complicated tasks or large volumes of data, Run in the Cloud is recommended for higher speed. You are welcome to try this premium feature by signing up for the 14-day free trial here. Tasks can be scheduled hourly, daily, or weekly, with data delivered regularly.