Scrape list information from Bing

You are browsing a tutorial guide for Octoparse's latest version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!

Bing is one of the most popular search engines around the world. In this tutorial, we are going to show you how to scrape result information from Bing.com.

To follow through, you may want to use this URL in the tutorial:

We will scrape data such as the title, URL, and description from the search results list with Octoparse.

The main steps are shown in the menu on the right and you can download the demo task here.


1. Create a Go to Web Page - to open the target web page

  • Enter the example URL and click Start


2. Set up pagination - to scrape multiple listing pages

  • Scroll down and click the ">" button on the web page

  • Click Loop click on the Tips panel

  • Set the AJAX timeout to 7s
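
Conceptually, the pagination loop repeats "extract the current page, then click the next-page button" until there is no next page. The following is a minimal Python sketch of that idea only. It is not how Octoparse works internally: the `PAGES` dictionary and the `collect_all_pages` helper are hypothetical stand-ins for real pages, and the AJAX timeout is not modeled.

```python
def collect_all_pages(pages, start):
    """Follow 'next' references until a page has none, gathering results in order."""
    collected = []
    seen = set()          # guards against a page that links back to an earlier one
    current = start
    while current is not None and current not in seen:
        seen.add(current)
        page = pages[current]
        collected.extend(page["results"])   # "extract data" on the current page
        current = page["next"]              # "click the > button"
    return collected


# Hypothetical in-memory pages standing in for three search result pages.
PAGES = {
    "page1": {"results": ["r1", "r2"], "next": "page2"},
    "page2": {"results": ["r3", "r4"], "next": "page3"},
    "page3": {"results": ["r5"], "next": None},
}

all_results = collect_all_pages(PAGES, "page1")
print(all_results)  # ['r1', 'r2', 'r3', 'r4', 'r5']
```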


3. Extract data - to scrape certain elements from each page

Let's start with the 2nd non-ad item on the search result list.

  • Click on the 2nd non-ad item title on the page

  • Click Select all similar elements on the Tips panel

  • Choose Text on the Tips panel

  • Click on the title of the second item

  • Choose Link on the Tips panel

  • If you need the description, click on the description text and then choose Text

  • You can also add predefined data fields from the "+" icon. Here, Current date & time is added to record when the data was extracted

  • Double-click on the field name to rename it if needed


The workflow created should look like this:


4. Modify the XPath - to locate the data precisely

At this point, some ads are still included in the loop. To exclude them, modify the XPath of the Loop Item.

  • Click on the Loop Item and change the XPath to //li[@class='b_algo']

  • Click Apply to save
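
To see why this XPath drops the ads, here is a minimal Python sketch using only the standard library. The HTML is a hypothetical, simplified stand-in for a results page (the `b_ad` class and the example.com URLs are assumptions, not real Bing markup). The leading `.` in each path is how ElementTree expresses a search starting from the current node.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified results list: one ad followed by two organic results.
SAMPLE_PAGE = """
<ol id="b_results">
  <li class="b_ad"><h2><a href="https://ads.example/offer">Sponsored result</a></h2><p>Ad text</p></li>
  <li class="b_algo"><h2><a href="https://example.com/a">First result</a></h2><p>First description</p></li>
  <li class="b_algo"><h2><a href="https://example.com/b">Second result</a></h2><p>Second description</p></li>
</ol>
"""

root = ET.fromstring(SAMPLE_PAGE.strip())

# A broad selector picks up every list item, ads included.
all_items = root.findall(".//li")

# Restricting the match to class 'b_algo' keeps only the organic results.
organic_items = root.findall(".//li[@class='b_algo']")

print(len(all_items), len(organic_items))  # 3 2
```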


The XPath for each data field also needs to be modified.

  • Switch the Data Preview to Vertical View

  • Modify the XPath of the fields as follows

Title: //h2

Title URL: //h2/a

Description: //p
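
These field XPaths are evaluated inside each loop item, so every result yields one row. The sketch below illustrates that idea in Python with the standard library. It assumes a hypothetical, simplified page (not real Bing markup), and the field names `Title_URL` and `Current_Time` and the helper `extract_rows` are made up for this example. ElementTree needs a leading `.` to search relative to the current item.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical, simplified results list used only for illustration.
SAMPLE_PAGE = """
<ol id="b_results">
  <li class="b_ad"><h2><a href="https://ads.example/offer">Sponsored result</a></h2><p>Ad text</p></li>
  <li class="b_algo"><h2><a href="https://example.com/a">First result</a></h2><p>First description</p></li>
  <li class="b_algo"><h2><a href="https://example.com/b">Second result</a></h2><p>Second description</p></li>
</ol>
"""


def text_of(node):
    """Return the full text inside a node, or an empty string if it is missing."""
    return "".join(node.itertext()).strip() if node is not None else ""


def extract_rows(page_xml):
    root = ET.fromstring(page_xml.strip())
    rows = []
    for item in root.findall(".//li[@class='b_algo']"):   # the loop item
        link = item.find(".//h2/a")                        # Title URL field
        rows.append({
            "Title": text_of(item.find(".//h2")),
            "Title_URL": link.get("href", "") if link is not None else "",
            "Description": text_of(item.find(".//p")),
            # Mirrors the predefined "Current date & time" field.
            "Current_Time": datetime.now().isoformat(timespec="seconds"),
        })
    return rows


rows = extract_rows(SAMPLE_PAGE)
for row in rows:
    print(row["Title"], row["Title_URL"])
```

The ad item never produces a row because the loop only visits `b_algo` items, which is why the loop XPath and the field XPaths are adjusted together.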




5. Run the task - to get your target data

  • Click Run to run your task either on your device or in the cloud

  • Select Standard Mode under the Run on your device section to run the task on your local device

  • Wait for the task to complete

    Here is the sample output data, which can be exported in Excel, CSV, HTML and JSON formats.
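
If you post-process an exported CSV file outside Octoparse, a minimal Python sketch could look like the following. The column names and rows in `EXPORTED_CSV` are assumptions made for illustration. An actual export will use whatever field names you set in the task.

```python
import csv
import io
import json

# Hypothetical CSV content standing in for an exported file.
EXPORTED_CSV = """Title,Title_URL,Description
First result,https://example.com/a,First description
Second result,https://example.com/b,"Second description, with a comma"
"""

# DictReader maps each row to the header names and handles quoted commas.
rows = list(csv.DictReader(io.StringIO(EXPORTED_CSV)))

# The same rows can be re-serialized as JSON for other tools.
as_json = json.dumps(rows, indent=2)

print(len(rows))               # 2
print(rows[1]["Description"])  # Second description, with a comma
```

To read a real file instead of the inline string, `open(path, newline="", encoding="utf-8")` can be passed to `csv.DictReader` in place of the `io.StringIO` object. The right encoding depends on how the file was exported.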
