Scrape hotel details from Airbnb

You are browsing a tutorial guide for Octoparse's latest version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

Airbnb is a popular website for finding vacation accommodation. In this tutorial, we will show you how to use Octoparse to scrape hotel details from Airbnb.

The easiest way is to use the pre-built Airbnb task templates. You don't need to configure a scraping task yourself; just enter keywords or URLs and wait for the data. For further details, see: Task Templates

The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Go to Web Page - open the target website

  • Enter the URL on the home page and click Start


2. Set up a Loop Item and Pagination - to click each hotel link and paginate

  • Select the first block to detect all blocks

  • Click on Select all similar elements

  • Click on Loop click each URL to enter the detail page

  • Click Yes to create a Pagination

  • Click on Next page button

  • Scroll down to the end of the page, click on the "next page" icon, and then click Confirm
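Conceptually, these steps build a pagination loop wrapped around an item loop. The following Python sketch only illustrates that nesting; it is not how Octoparse is implemented, and ResultPage, scrape, and the example.com URLs are hypothetical names made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class ResultPage:
    """One page of search results (hypothetical stand-in for a live page)."""
    listing_urls: list[str]
    has_next: bool


def scrape(pages: list[ResultPage]) -> list[dict]:
    """Walk every results page and every listing link on each page."""
    rows = []
    page_index = 0
    while True:
        page = pages[page_index]
        # Loop Item: visit each listing link on the current results page.
        for url in page.listing_urls:
            # Click Item + Extract Data: this sketch only records the URL.
            rows.append({"page": page_index + 1, "url": url})
        # Click to Paginate: stop when there is no "next page" button.
        if not page.has_next:
            break
        page_index += 1
    return rows


pages = [
    ResultPage(["https://example.com/rooms/1", "https://example.com/rooms/2"], True),
    ResultPage(["https://example.com/rooms/3"], False),
]
rows = scrape(pages)
print(len(rows))
```

The outer loop ends when a page has no "next page" button, which is why the pagination step has to find that button reliably.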

The workflow created should look like this:

The next page is loaded with AJAX, so we need to set an AJAX timeout on the Click to Paginate action.

  • Click on Click to Paginate

  • Go to the Options

  • Tick Load with AJAX

  • Set the AJAX timeout to 5-10s
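To see why the timeout matters, here is a minimal Python sketch of waiting for AJAX content after a click. It is an illustration under assumptions, not Octoparse's internal logic: wait_for_new_content, make_slow_page, and the "page-1"/"page-2" markers are hypothetical, and the page is simulated with a timer.

```python
import time
from typing import Callable


def wait_for_new_content(read_marker: Callable[[], str], old_marker: str,
                         timeout_s: float, poll_s: float = 0.05) -> bool:
    """Poll until the page marker changes or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_marker() != old_marker:
            return True
        time.sleep(poll_s)
    return False


def make_slow_page(delay_s: float) -> Callable[[], str]:
    """Simulate a page whose content changes delay_s after the click."""
    clicked_at = time.monotonic()

    def read_marker() -> str:
        return "page-2" if time.monotonic() - clicked_at >= delay_s else "page-1"

    return read_marker


# Generous timeout: the new results arrive before the wait gives up.
loaded = wait_for_new_content(make_slow_page(0.2), "page-1", timeout_s=1.0)
# Timeout shorter than the load time: the wait gives up too early.
timed_out = wait_for_new_content(make_slow_page(0.5), "page-1", timeout_s=0.1)
print(loaded, timed_out)
```

If the timeout is shorter than the time the site needs to load the next results, the task may move on before the new listings appear, so choose a value toward the upper end of the range on slow connections.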


3. Modify the XPath of the Loop Item - to locate the items accurately

The auto-generated XPath does not always locate every item correctly. For this site, we need to modify the XPath of the Loop Item.

  • Click on Loop Item

  • Switch Loop Mode to Variable list

  • Enter XPath: //div[@data-testid="card-container"]/a

  • Click Apply to save

Note: XPath plays an important role in locating the correct element in Octoparse. To learn more about it, please refer to the following tutorial: What is XPath and how to use it in Octoparse
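To check what the XPath from this step selects, you can try it on a small piece of markup. The sketch below uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath and expects a leading "." for a search relative to the root element. The sample markup is a hypothetical, heavily simplified stand-in for the real Airbnb page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page is far larger.
SAMPLE = """
<html><body>
  <div data-testid="card-container"><a href="/rooms/101">Loft</a></div>
  <div data-testid="card-container"><a href="/rooms/102">Cabin</a></div>
  <div data-testid="map-container"><a href="/map">Map</a></div>
</body></html>
"""

root = ET.fromstring(SAMPLE.strip())

# Same expression as in the Loop Item, with "." added for ElementTree.
links = root.findall('.//div[@data-testid="card-container"]/a')
hrefs = [a.get("href") for a in links]
print(hrefs)  # ['/rooms/101', '/rooms/102']
```

Only the links directly inside a card-container div are matched; the link in the map-container div is skipped, which is what keeps the Loop Item limited to the listings.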


4. Extract data from the detail page

  • Click on Click Item to enter the detail page

  • Select any info you want and click on Text on the Tips panel

  • Select Add custom field -> Page-level data -> Page URL if you would like to pull the page URL from the current page

  • Double-click the data field to modify the name


5. Run your task - get the data you want

  • Click Run to run your task either on your device or in the cloud

  • Select Standard Mode under Run on your device section to run the task on your local device

  • Wait for the task to complete


Here is the sample output data, which can be exported in Excel, CSV, HTML and JSON formats.
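Once the data is exported, you may want to post-process it. The following Python sketch reads a CSV export, drops rows that share a page URL (which can happen if a listing appears on more than one results page), and converts the rest to JSON. The column names Title, Price, and Page_URL and the sample rows are assumptions; your export will use the field names you defined in the task.

```python
import csv
import io
import json

# Hypothetical export; real column names depend on the fields you defined.
EXPORTED_CSV = """Title,Price,Page_URL
Loft,$120,https://example.com/rooms/101
Cabin,$95,https://example.com/rooms/102
Loft,$120,https://example.com/rooms/101
"""

seen = set()
unique_rows = []
for row in csv.DictReader(io.StringIO(EXPORTED_CSV)):
    # Keep only the first row for each page URL.
    if row["Page_URL"] not in seen:
        seen.add(row["Page_URL"])
        unique_rows.append(row)

as_json = json.dumps(unique_rows, indent=2)
print(len(unique_rows))  # 2
```

For a real export, replace io.StringIO(EXPORTED_CSV) with an opened file, for example open("export.csv", newline="", encoding="utf-8"), where export.csv is a placeholder file name.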
