Scrape hotel information from Trip.com
Updated over a week ago

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!

Trip.com lists hotels in 200 countries and regions. For each hotel, customers can find information such as the price, services, and reviews.

This tutorial shows how to collect hotel information, such as hotel name, location, comments, price, and rating, from Trip.com with Octoparse.


To follow along, you can use the URL below:

The main steps are shown in the menu on the right and you can download the task file here.


1. Create a Go to Web Page - to open the target website

  • Enter the target URL into the search bar on the home screen and click Start.

Note: Pagination on Trip.com is not straightforward. The Search More Hotels button appears only after you scroll down the page several times, so the workflow needs a page scroll right after the page loads.

  • Click on Go to Web Page > Options

  • Tick Scroll down the page after it is loaded

  • Set the Scroll to repeat 10 times and Wait 2s for each scroll

  • Click Apply to save the settings
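The scroll settings above also set a lower bound on how long each page load takes before extraction starts. The sketch below only does that arithmetic. `ScrollSettings` is a hypothetical helper written for this illustration and is not part of Octoparse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrollSettings:
    """Hypothetical container mirroring the two scroll options in the dialog."""

    repeats: int          # "Scroll to repeat N times"
    wait_seconds: float   # "Wait Ns for each scroll"

    def minimum_duration(self) -> float:
        """Lower bound on time spent scrolling, ignoring page load time."""
        return self.repeats * self.wait_seconds


# Values taken from the steps above: 10 repeats, 2s wait per scroll.
initial_scroll = ScrollSettings(repeats=10, wait_seconds=2.0)
print(initial_scroll.minimum_duration())  # 20.0
```

With these values, every page load spends at least 20 seconds scrolling, which is worth keeping in mind when estimating how long a run will take.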


2. Auto-detect web page data - to create a workflow

Octoparse's Auto-detection function can help you create a workflow quickly according to the design of the target website.

  • Click Auto-detect web page data on the Tips panel and wait for the detection to complete

  • Uncheck both Paginate to scrape more pages and Add a Page Scroll, because scrolling and pagination are set up manually in the other steps

  • Click Create workflow

  • Check the data fields in Data Preview

    • Delete unnecessary data fields directly by clicking More and Delete field

    • Modify the data field names by double-clicking the headers
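If you prefer to tidy the fields after exporting rather than in Data Preview, the same delete and rename operations can be sketched in Python. This is a minimal illustration, not an Octoparse feature: the function `tidy_rows`, the field names, and the placeholder row are all assumptions made up for the example.

```python
def tidy_rows(rows, drop, rename):
    """Drop unwanted fields and rename the rest, keeping the field order."""
    return [
        {rename.get(key, key): value
         for key, value in row.items()
         if key not in drop}
        for row in rows
    ]


# Placeholder row: field names and values are invented for illustration only.
rows = [
    {"Field1": "Example Hotel", "Field2": "Example City", "Field3_URL": "https://example.com/img.jpg"},
]

cleaned = tidy_rows(
    rows,
    drop={"Field3_URL"},                                   # like More > Delete field
    rename={"Field1": "hotel_name", "Field2": "location"},  # like renaming a header
)
print(cleaned)
```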


3. Click the Load More button - to load more hotels

  • Scroll down to the bottom of the page until you see the Search More Properties button

  • Click Search More Properties > Loop click on the Tips panel


4. Modify the settings for Click to Paginate - to extract the new hotels' information

  • Click on Click to Paginate > Options > Load with AJAX

  • Set the AJAX timeout to 10s

  • Tick Scroll down the page after it is loaded > Scroll to the bottom of the page

  • Set the scroll to repeat 10 times and the wait time to 2s

  • Click Apply to save the settings

Note: After you click Search More Hotels on Trip.com, the new results often take a long time to load, so we need to add a wait before action time ahead of the page scroll.
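To make the interplay of these settings easier to picture, here is a rough Python model of a click, wait, and scroll loop. It is a sketch under stated assumptions and not how Octoparse is implemented: `load_all_results` and its callback parameters are hypothetical, and the browser is replaced by simple stand-in functions so the example runs on its own.

```python
import time
from typing import Callable


def load_all_results(
    click_load_more: Callable[[], bool],
    new_results_loaded: Callable[[], bool],
    scroll_to_bottom: Callable[[], None],
    ajax_timeout: float = 10.0,
    scroll_repeats: int = 10,
    scroll_wait: float = 2.0,
    poll_interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Click the button until it is gone and return the number of clicks."""
    clicks = 0
    while click_load_more():  # False means the button was not found
        clicks += 1
        waited = 0.0
        # Wait for the new results, but never longer than the AJAX timeout.
        while not new_results_loaded() and waited < ajax_timeout:
            sleep(poll_interval)
            waited += poll_interval
        # Scroll so the newly loaded hotels render and the button can reappear.
        for _ in range(scroll_repeats):
            scroll_to_bottom()
            sleep(scroll_wait)
    return clicks


# Stand-ins for a real browser: the button exists twice, then disappears.
button_states = [True, True, False]
scroll_log = []

total_clicks = load_all_results(
    click_load_more=lambda: button_states.pop(0),
    new_results_loaded=lambda: True,
    scroll_to_bottom=lambda: scroll_log.append("scroll"),
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(total_clicks, len(scroll_log))
```

In this model, a longer timeout only costs time when the results are slow to arrive, while the scroll repeats and wait time are paid on every click.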


5. Run the task - to get your target data

  • Click Run to run your task either on your device or in the cloud

  • Select Standard Mode under Run on your device section to run the task on your local device

  • Wait for the task to complete


Here is the sample data for your reference:
