You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier and more robust! Download and upgrade here if you haven't already done so!
Trip.com lists hotels in 200 countries and regions and shows information such as the price, services, and reviews of each hotel.
This tutorial shows how to collect hotel information from Trip.com with Octoparse, such as hotel name, location, comments, price, and rating.
To follow along, you can use the URL below:
The main steps are shown in the menu on the right and you can download the task file here.
1. Create a Go to Web Page - to open the target website
Note: Pagination on Trip.com is a little complicated. The Search More Hotels button only shows up after you scroll down several times, so we need to add a page scroll at the beginning.
Click on Go to Web Page > Options
Tick Scroll down the page after it is loaded
Set the Scroll to repeat 10 times and Wait 2s for each scroll
Click Apply to save the settings
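To make the scroll settings above concrete, here is a minimal Python sketch of a scroll that repeats 10 times with a 2s wait after each scroll. It is only an illustration of the idea, not how Octoparse works internally; `scroll_after_load`, `scroll_once`, and the `sleep` parameter are hypothetical names introduced for this example.

```python
import time
from typing import Callable


def scroll_after_load(
    scroll_once: Callable[[], None],
    repeats: int = 10,
    wait_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Scroll `repeats` times, pausing `wait_seconds` after each scroll.

    Returns the total time spent waiting, in seconds.
    """
    total_wait = 0.0
    for _ in range(repeats):
        scroll_once()        # hypothetical callback that performs one scroll
        sleep(wait_seconds)  # pause so newly revealed hotels can render
        total_wait += wait_seconds
    return total_wait


if __name__ == "__main__":
    # Dry run: a no-op sleep lets the example finish instantly.
    scrolls = []
    waited = scroll_after_load(lambda: scrolls.append(1), sleep=lambda s: None)
    print(len(scrolls), waited)  # prints: 10 20.0
```

With these values, the scroll adds about 20 seconds of waiting (10 x 2s) each time the page is loaded.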
2. Auto-detect web page data - to create a workflow
Octoparse's Auto-detection function can help you create a workflow quickly according to the design of the target website.
Click Auto-detect web page data on Tips and wait for the detection to complete
Uncheck both Paginate to scrape more pages and Add a Page Scroll
Click Create workflow
Check the data fields in Data Preview
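Data Preview is checked by eye, but the same check can be expressed in code. The Python sketch below flags which of the fields this tutorial collects (hotel name, location, comments, price, rating) are missing or blank in a row. The column names in `EXPECTED_FIELDS` and the sample row are made up for illustration; real names depend on how the fields are labeled in your task.

```python
from typing import Dict, List

# Hypothetical column names for the fields this tutorial collects;
# the names in a real task depend on how the fields are labeled.
EXPECTED_FIELDS = ("hotel_name", "location", "comments", "price", "rating")


def missing_fields(row: Dict[str, object]) -> List[str]:
    """Return the expected fields that are absent or blank in one row."""
    missing = []
    for field in EXPECTED_FIELDS:
        value = row.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


if __name__ == "__main__":
    # Made-up row: location is blank, comments and rating are absent.
    row = {"hotel_name": "Example Hotel", "location": "", "price": "120"}
    print(missing_fields(row))  # prints: ['location', 'comments', 'rating']
```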
3. Click the Load More button - to load more hotels
Scroll down to the bottom of the page until you see the Search More Properties button
Click Search More Properties > Loop click on the Tips panel
4. Modify the settings for Click to Paginate - to extract the new hotels' information
Click on Click to Paginate > Options > Load with AJAX
Set the AJAX timeout to 10s
Tick Scroll down the page after it is loaded > Scroll to the bottom of the page
Set the scroll to repeat 10 times, with a wait time of 2s
Click Apply to save the settings
Note: After you click Search More Hotels on Trip.com, the page often takes a long time to finish loading, so we need to allow a wait time before the page scroll starts.
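The logic behind steps 3 and 4 can be sketched as a loop: click the load-more button, wait up to the AJAX timeout for new hotels to appear, scroll, and repeat until the button is gone. The Python sketch below runs against an in-memory stand-in page. `FakePage`, `load_all`, and all their parameters (including `poll` and `max_clicks`) are hypothetical and are not part of Octoparse or Trip.com.

```python
import time
from typing import Callable, Sequence


class FakePage:
    """In-memory stand-in for a listing page (illustration only)."""

    def __init__(self, initial: int, batches: Sequence[int]) -> None:
        self.items = initial
        self._batches = list(batches)
        self.scrolls = 0

    def item_count(self) -> int:
        return self.items

    def click_load_more(self) -> bool:
        """Load the next batch; return False when the button is gone."""
        if not self._batches:
            return False
        self.items += self._batches.pop(0)
        return True

    def scroll_to_bottom(self) -> None:
        self.scrolls += 1


def load_all(
    page: FakePage,
    ajax_timeout: float = 10.0,
    scroll_repeats: int = 10,
    scroll_wait: float = 2.0,
    poll: float = 0.5,
    max_clicks: int = 50,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Click load-more until it disappears; return the number of clicks."""
    clicks = 0
    while clicks < max_clicks:
        before = page.item_count()
        if not page.click_load_more():
            break  # no button left, so every hotel is loaded
        clicks += 1
        # AJAX load: the URL does not change, so poll the item count
        # until new hotels appear or the timeout expires.
        deadline = time.monotonic() + ajax_timeout
        while page.item_count() == before and time.monotonic() < deadline:
            sleep(poll)
        # Scroll so the next load-more button comes into view.
        for _ in range(scroll_repeats):
            page.scroll_to_bottom()
            sleep(scroll_wait)
    return clicks


if __name__ == "__main__":
    page = FakePage(initial=10, batches=[10, 10, 5])
    clicks = load_all(page, sleep=lambda s: None)  # no-op sleep for a dry run
    print(clicks, page.items, page.scrolls)  # prints: 3 35 30
```

The `max_clicks` cap is a design choice for the sketch: it keeps the loop from running forever if the button never disappears.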
5. Run the task - to get your target data
Click Run to run your task either on your device or in the cloud
Select Standard Mode under Run on your device section to run the task on your local device
Wait for the task to complete
Here is the sample data for your reference: