You are browsing a tutorial guide for Octoparse's latest version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!
Airbnb is a popular website for finding a place to stay on vacation. In this tutorial, we will show you how to use Octoparse to scrape hotel information from Airbnb.
The easiest way is to use the pre-built Airbnb task templates. With a template, you don't need to configure a scraping task; just enter keywords or URLs and wait for the data. For further details, see: Task Templates
If you want to build the task from scratch, continue reading this tutorial. Below is the Airbnb search results link that we will use as an example.
https://www.airbnb.com/s/New-York--NY--United-States/homes?adults=2&search_type=pagination&s_tag=A2EV74MC&tab_id=home_tab&refinement_paths%5B%5D=%2Fhomes&children=1&place_id=ChIJOwg_06VPwokRYv534QaPC8g&federated_search_session_id=2e7da092-4a51-48db-ba26-9746f41ac068
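The example link is long because the search settings are carried in its query string. As a rough illustration (not part of the Octoparse workflow), the Python sketch below splits the URL with the standard library so the individual parameters are easier to read. What each parameter means to Airbnb is an assumption based on its name only.

```python
from urllib.parse import urlsplit, parse_qs

# The example search URL from this tutorial, split across lines for readability
url = (
    "https://www.airbnb.com/s/New-York--NY--United-States/homes"
    "?adults=2&search_type=pagination&s_tag=A2EV74MC&tab_id=home_tab"
    "&refinement_paths%5B%5D=%2Fhomes&children=1"
    "&place_id=ChIJOwg_06VPwokRYv534QaPC8g"
    "&federated_search_session_id=2e7da092-4a51-48db-ba26-9746f41ac068"
)

parts = urlsplit(url)
# parse_qs decodes percent-encoding and returns each value as a list
params = parse_qs(parts.query)

print(parts.path)
for name, values in params.items():
    print(name, "=", values[0])
```

Some values, such as the session ID, look specific to one browsing session, so a URL copied from your own browser will probably differ from the one shown here.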
The main steps are shown in the menu on the right, and you can download the sample task file here.
1. Go to Web Page - open the target website
2. Set up a Loop Item and Pagination - to click each hotel link and paginate
Select the first block to detect all blocks
Click on Select all similar elements
Click on Loop click each URL to enter the detail page
Click Yes to create a Pagination
Click on Next page button
Scroll down to the end of the page, click on the "next page" icon, and click Confirm
The workflow created should look like this:
The next page is loaded with AJAX, so we need to set an AJAX timeout on the Click to Paginate action. This gives the new results time to load before the task continues.
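To make the structure of this workflow easier to picture, here is a conceptual sketch in plain Python. It is not how Octoparse is implemented, and the names `FAKE_SITE` and `run_workflow` and the page data are made up for illustration. It only mirrors the nesting: a pagination loop that contains a loop over the items on each page.

```python
# Conceptual sketch only: plain Python standing in for the Octoparse workflow.
# The "site" below is invented data, one inner list per results page.
FAKE_SITE = [
    ["/rooms/1", "/rooms/2"],   # listing links on results page 1
    ["/rooms/3"],               # listing links on results page 2
]

def run_workflow(site):
    visited = []
    page_index = 0                        # Go to Web Page: open the first results page
    while True:                           # Pagination loop
        for link in site[page_index]:     # Loop Item: every listing block on the page
            visited.append(link)          # Click Item: open the detail page
        if page_index + 1 >= len(site):   # no "next page" button left, so stop
            break
        page_index += 1                   # Click to Paginate (then wait for the AJAX reload)
    return visited

print(run_workflow(FAKE_SITE))
```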
3. Modify the XPath of the Loop Item - to locate the items accurately
The auto-generated XPath does not always work well. In this case, we will need to modify the XPath of the Loop Item.
Click on Loop Item
Switch Loop Mode to Variable list
Enter XPath: //div[@data-testid="card-container"]/a
Click Apply to save
Note: XPath plays an important role in locating the correct element in Octoparse. To learn more about it, please refer to the following tutorial: What is XPath and how to use it in Octoparse
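To see what this XPath selects, the following Python sketch runs an equivalent expression against a small piece of made-up markup. The sample HTML is an assumption for illustration; real Airbnb pages are far larger and their structure may change. Python's built-in ElementTree supports only a subset of XPath, so the expression is written with a leading dot.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified markup; only the data-testid attribute mirrors the tutorial
SAMPLE = """
<html><body>
  <div data-testid="card-container"><a href="/rooms/1">Loft in SoHo</a></div>
  <div data-testid="card-container"><a href="/rooms/2">Studio in Brooklyn</a></div>
  <div data-testid="banner"><a href="/promo">Not a listing</a></div>
</body></html>
"""

root = ET.fromstring(SAMPLE)

# ElementTree needs a leading "." for a document-wide search, so
# //div[@data-testid="card-container"]/a becomes .//div[...]/a here.
links = root.findall(".//div[@data-testid='card-container']/a")

for a in links:
    print(a.get("href"), a.text)
```

Only the links that sit directly inside a `card-container` div are matched; the link in the `banner` div is skipped, which is the kind of precision the modified Loop Item XPath is meant to provide.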
4. Extract data from the detail page
Click on Click Item to enter the detail page
Select any info you want and click on Text on the Tips panel
Select Add custom field -> Page-level data -> Page URL if you would like to pull the page URL from the current page
Double-click the data field to modify the name
5. Run your task - get the data you want
Click Run to run your task either on your device or in the cloud
Select Standard Mode under Run on your device section to run the task on your local device
Wait for the task to complete
Here is the sample output data, which can be exported in Excel, CSV, HTML and JSON formats.
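Once the data is exported, it can be processed with any tool you like. As a minimal sketch, the Python snippet below reads CSV text and finds the cheapest listing. The column names (`Title`, `Price`, `Page_URL`) and the rows are invented for illustration; your export will use whatever field names you set in step 4.

```python
import csv
import io
import json

# Stand-in for an exported CSV file; the column names and rows are invented
EXPORTED_CSV = """Title,Price,Page_URL
Loft in SoHo,$180,https://www.airbnb.com/rooms/1
Studio in Brooklyn,$120,https://www.airbnb.com/rooms/2
"""

rows = list(csv.DictReader(io.StringIO(EXPORTED_CSV)))

# Convert "$180" style strings to integers so rows can be sorted or filtered
for row in rows:
    row["Price"] = int(row["Price"].lstrip("$"))

cheapest = min(rows, key=lambda r: r["Price"])
print(json.dumps(cheapest, indent=2))
```

With a real export, replace `io.StringIO(EXPORTED_CSV)` with `open("your_file.csv", newline="", encoding="utf-8")`, where the file name is a placeholder for your own export.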