Scrape product data from Canadian Tire

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

Canadian Tire is a Canadian retail company that operates in the automotive, hardware, sports, leisure, and housewares sectors.

In this tutorial, we will show you how to collect product information on canadiantire.com with Octoparse.


To follow along, you may want to use this URL:

The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Create Go to Web Page - to open the target page

  • Enter the URL on the home page and click Start

Note: If any pop-ups appear on the web page, switch to Browse mode and close them manually. Remember to turn off Browse mode once the pop-ups are closed.


2. Auto-detect the web page - to create the workflow

  • Click on Auto-detect web page data and wait for the detection to complete

  • Untick Add a page scroll and click Create workflow


The workflow should look like this:

workflow.jpg

  • Check the data fields in Data Preview and delete unwanted fields or rename them if needed

    • Delete unnecessary data fields directly by clicking More and Delete field

    • Modify the data field names by double-clicking the headers


3. Modify the XPath of the data field - to locate elements accurately

In this case, the auto-generated XPath for the price per tire field fails to pick up all the data, so we need to update it manually.

  • Turn the Data Preview panel into a Vertical View

  • Enter the following XPath for the price field:

    //div[@class="nl-price--charge"]//span[contains(text(),'Each')]/..
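To make clear what this XPath selects, here is a minimal Python sketch that mimics it with the standard library. The sample markup is hypothetical and much simpler than the real page, and the helper name `price_per_tire_nodes` is an assumption made for illustration. Octoparse evaluates the XPath itself, so nothing like this is needed inside the tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page structure may differ.
SAMPLE = """
<html><body>
  <div class="nl-price--charge">
    <span class="price"><span>$129.99</span><span> Each</span></span>
  </div>
  <div class="nl-price--charge">
    <span class="price"><span>$519.96</span><span> Set of 4</span></span>
  </div>
</body></html>
"""


def price_per_tire_nodes(root):
    """Mimic //div[@class="nl-price--charge"]//span[contains(text(),'Each')]/.."""
    matches = []
    for div in root.iter("div"):
        if div.get("class") != "nl-price--charge":
            continue
        # ElementTree has no parent pointers, so map each child to its parent
        # to be able to step up one level, as "/.." does in the XPath.
        parent_of = {child: parent for parent in div.iter() for child in parent}
        for span in div.iter("span"):
            # span.text is the element's first text node, which is what
            # contains(text(), 'Each') tests in XPath 1.0.
            if "Each" in (span.text or ""):
                matches.append(parent_of[span])
    return matches


root = ET.fromstring(SAMPLE)
prices = ["".join(node.itertext()).strip() for node in price_per_tire_nodes(root)]
print(prices)
```

The span containing "Each" only locates the right price block; stepping up to its parent with `/..` is what captures the amount and the unit together.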


4. Set up Pagination Loop - to scrape data from multiple listing pages

  • Click the next page button

  • Click Loop click in the tip box after the button turns green

After the pagination is created, the workflow should look like this:

The auto-generated XPath for Pagination does not work properly, so we need to update it.

  • Click on Pagination

  • Paste the updated XPath //a[@rel="next"]

  • Click on Apply
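The logic behind this pagination XPath can be sketched in a few lines of Python: keep following the link marked `rel="next"` until a page no longer has one. The pages, URLs, and the `crawl` helper below are hypothetical stand-ins held in memory; Octoparse performs the actual clicking and loading.

```python
import xml.etree.ElementTree as ET

# Hypothetical listing pages keyed by URL path (assumed for illustration only).
PAGES = {
    "/tires?page=1": '<html><body><p>Tire A</p><a rel="next" href="/tires?page=2">Next</a></body></html>',
    "/tires?page=2": '<html><body><p>Tire B</p><a rel="next" href="/tires?page=3">Next</a></body></html>',
    "/tires?page=3": '<html><body><p>Tire C</p></body></html>',  # last page: no next link
}


def crawl(start):
    """Follow //a[@rel="next"] from page to page and return the visited URLs."""
    visited = []
    url = start
    while url is not None and url not in visited:
        visited.append(url)
        root = ET.fromstring(PAGES[url])
        # Same selector as in the tutorial, in ElementTree's XPath subset.
        next_link = root.find(".//a[@rel='next']")
        url = next_link.get("href") if next_link is not None else None
    return visited


visited = crawl("/tires?page=1")
print(visited)
```

Because the selector matches on the `rel` attribute rather than on the button's position or label, the loop stops cleanly on the last page, where no such link exists.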


5. Set scroll settings - to fully load images

  • Click Go to Web Page

  • Click Options

  • Tick Scroll down the page after it is loaded

  • Select Scroll for one screen

  • Wait 2s

  • Scroll 50 times

  • Click Apply

  • Click Click to paginate in the workflow

  • Click Options

  • Tick Scroll down the page after it is loaded

  • Select Scroll for one screen

  • Wait 2s

  • Scroll 50 times

  • Click Apply
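These scroll settings add noticeable run time, so a rough estimate can help with planning. The sketch below assumes the 2s wait applies once per scroll and ignores page-load and extraction time, so it gives a lower bound only. The page count of 30 is a hypothetical figure, and both function names are made up for illustration.

```python
def scroll_seconds_per_page(scroll_times=50, wait_seconds=2):
    """Time spent scrolling one page, assuming one wait per scroll."""
    return scroll_times * wait_seconds


def estimated_scroll_minutes(pages, scroll_times=50, wait_seconds=2):
    """Lower-bound scrolling time in minutes for a run over `pages` listing pages."""
    return pages * scroll_seconds_per_page(scroll_times, wait_seconds) / 60


per_page = scroll_seconds_per_page()      # 50 scrolls x 2 s = 100 s
total = estimated_scroll_minutes(30)      # 30 hypothetical pages
print(per_page, total)
```

If the images load sooner than this on your connection, reducing the scroll count is the simplest way to shorten the run.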



6. Run the task - to get your target data

  • Click Run to run your task either on your device or in the cloud

  • Select Standard Mode under Run on your device section to run the task on your local device

  • Wait for the task to complete

Here is the sample output data, which can be exported in Excel, CSV, HTML and JSON formats.
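Once exported, the data can be processed further outside Octoparse. As a minimal sketch, the Python below reads a CSV export and converts the price text into numbers. The column names (`Title`, `Price_per_tire`) and the sample rows are assumptions made for illustration; your export will use whatever field names you set in Data Preview.

```python
import csv
import io
import re

# Hypothetical export content; a real run would open the exported CSV file instead.
SAMPLE_CSV = """Title,Price_per_tire
Tire A,$129.99 Each
Tire B,"$1,049.50 Each"
Tire C,
"""


def parse_price(text):
    """Extract the dollar amount from text such as '$129.99 Each'; None if absent."""
    match = re.search(r"\$\s*([\d,]+(?:\.\d+)?)", text or "")
    return float(match.group(1).replace(",", "")) if match else None


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
prices = [parse_price(row["Price_per_tire"]) for row in rows]
print(prices)
```

Rows with an empty price come back as `None`, which makes it easy to spot listings where the price field did not match.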
