
How to select a specific option from a drop-down list?

Updated over 9 months ago

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so.

Dropdown lists are common elements on web pages, and you might need to extract data from them. Whether you need to retrieve all options or just specific ones, this tutorial will guide you on how to select the options you need.

To quickly and accurately locate an option, writing the correct XPath is essential. We'll walk through an example to show you how.

You may want to use this example link to follow through:

The example page has a dropdown list that contains a lot of options.

Let's loop through all the options in the dropdown menu first.

  • Click on the dropdown menu and choose "Loop through the options in the dropdown menu" on the Tips panel

Click on the Loop Item and you will notice that the default XPath is:

//select[@id="brand"]/OPTION

As you can see, there are 65 items inside the dropdown list.

To select a specific option that meets your requirements, you will need to update the XPath for Loop Item.


Choose a specific option by its index

For example, if you want to select the 4th option, which is "Audi", the correct XPath should be:

//select[@id="brand"]/OPTION[4]

Simply add [X] to the end of the XPath, where X is the position of the option you want. XPath counts positions from 1, so [1] is the first option. Once you replace the default XPath with the new one, the 4th option will appear.
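Outside Octoparse, the index predicate can be tried out with a short Python sketch. The markup below is hypothetical: only the id "brand" and the 4th option "Audi" come from this tutorial, and the other brand names are placeholders. Python's built-in ElementTree supports only a subset of XPath, so the path is written relative (".//") and the tag name is lowercase to match the sample markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for the real page. Only the id "brand"
# and the 4th option ("Audi") are taken from the tutorial.
MARKUP = """
<div>
  <select id="brand">
    <option>Acura</option>
    <option>Alfa Romeo</option>
    <option>Aston Martin</option>
    <option>Audi</option>
    <option>BMW</option>
    <option>Chevrolet</option>
  </select>
</div>
""".strip()

root = ET.fromstring(MARKUP)

# [4] selects the 4th <option> child of the select; positions start at 1.
fourth = root.findall(".//select[@id='brand']/option[4]")
print([o.text for o in fourth])  # ['Audi']
```

Changing the 4 to another number selects the option at that position instead.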


Choose a specific option by its text

If you want to select all options containing "A", the correct XPath should be:

//select[@id="brand"]/OPTION[contains(text(),'A')]

The contains() function selects every option whose text includes the given string. The match is case-sensitive, so 'A' does not match a lowercase "a".
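The effect of the text predicate can be illustrated with a small Python sketch. Python's built-in ElementTree does not implement contains(), so this is a plain-Python stand-in for the predicate rather than a real XPath evaluation, and the option list is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical option list; the real dropdown's contents are not reproduced here.
MARKUP = """
<select id="brand">
  <option>Acura</option>
  <option>Audi</option>
  <option>BMW</option>
  <option>Mazda</option>
</select>
""".strip()

select = ET.fromstring(MARKUP)

# Stand-in for OPTION[contains(text(),'A')]:
# keep the options whose text includes an uppercase "A".
matches = [o.text for o in select.findall("option") if "A" in (o.text or "")]
print(matches)  # ['Acura', 'Audi']
```

"Mazda" is left out because it only contains a lowercase "a", which mirrors the case-sensitive behavior of contains().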


Choose a specific option by its position

If you want to select all options except the first two, the correct XPath should be:

//select[@id="brand"]/OPTION[position()>2]

By using '>', '=', or '<' after 'position()', you can adjust the XPath to fit your needs.

If you want to select only the last option, the correct XPath should be:

//select[@id="brand"]/OPTION[last()]
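Both position-based predicates can be sketched in Python to see which options they keep. The option list is hypothetical. Python's built-in ElementTree understands last() but not position() comparisons, so the position()>2 case is emulated in plain Python with 1-based numbering.

```python
import xml.etree.ElementTree as ET

# Hypothetical option list used only for illustration.
MARKUP = """
<select id="brand">
  <option>Acura</option>
  <option>Alfa Romeo</option>
  <option>Aston Martin</option>
  <option>Audi</option>
  <option>BMW</option>
</select>
""".strip()

select = ET.fromstring(MARKUP)
options = select.findall("option")

# Stand-in for OPTION[position()>2]: XPath positions start at 1,
# so number the options from 1 and keep positions greater than 2.
after_first_two = [o.text for pos, o in enumerate(options, start=1) if pos > 2]

# OPTION[last()] is supported by ElementTree directly.
last = [o.text for o in select.findall("option[last()]")]

print(after_first_two)  # ['Aston Martin', 'Audi', 'BMW']
print(last)             # ['BMW']
```

Swapping `pos > 2` for `pos < 3` or `pos == 3` corresponds to using '<' or '=' after position().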

To check whether your modified XPath works in Octoparse, click Apply to save it, click another action in the workflow, and then click Loop Item again.
