If the data you need is not visible after you choose Auto-detect web page data, Octoparse shows the Not the right webpage? prompt once auto-detection finishes. Click one of the options it offers to continue setting up the task.
1. Log in to the website
When you click this option, Octoparse switches to Browse Mode so you can enter your login credentials. Type in your username and password just as you would in a normal browser. Once you've successfully logged in to the account, click Done.
Cookies are then saved automatically to the task and used for future access. Please note that Octoparse does not keep or save your login credentials and no login steps will be generated and added to the workflow in this case.
Now that you've logged in to the account, you can either scrape your target data manually or run auto-detection again.
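The idea of saving cookies instead of credentials can be sketched in a few lines of Python. This is only an illustration of the general technique, not how Octoparse stores cookies internally; the function names, the JSON file format, and the cookie names are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def save_cookies(cookies: dict[str, str], path: Path) -> None:
    """Persist session cookies (never the username or password) to disk."""
    path.write_text(json.dumps(cookies), encoding="utf-8")


def load_cookies(path: Path) -> dict[str, str]:
    """Read previously saved cookies back for a later run."""
    return json.loads(path.read_text(encoding="utf-8"))


def cookie_header(cookies: dict[str, str]) -> str:
    """Build the value of the HTTP Cookie request header."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


if __name__ == "__main__":
    # Hypothetical cookies a site might set after a successful login.
    session = {"sessionid": "abc123", "theme": "dark"}
    cookie_file = Path(tempfile.mkdtemp()) / "cookies.json"
    save_cookies(session, cookie_file)
    # A later run reuses the cookies; no login step is repeated.
    print(cookie_header(load_cookies(cookie_file)))
```

Because only the cookies are written out, a later run can reach the logged-in pages without the password ever being stored. The flip side is that access stops working once the site expires the session.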
2. Close the pop-up window
Some websites show a pop-up window when you open them in Octoparse. The pop-up won't necessarily interfere with the extraction, but it does get in the way while you set up the task. Follow the steps below to close it.
Select the Close a pop-up option
Click the Close button on the pop-up window or any other element that does the same thing. In the example below, click the ACCEPT button to continue.
Click Confirm to finish up.
Octoparse will ask whether you'd like to adjust the AJAX timeout (see more in Deal with AJAX). Follow the instructions on the AJAX Setup panel if needed.
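To give a feel for what an AJAX timeout controls, here is a minimal sketch of the general "wait until the page has updated, but give up after a limit" pattern. It is not Octoparse's implementation; `wait_until` and its parameters are hypothetical names chosen for the example.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float,
               interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Returns True if the condition was met in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for "the pop-up has disappeared": becomes True on the 3rd check.
    checks = {"count": 0}

    def popup_closed() -> bool:
        checks["count"] += 1
        return checks["count"] >= 3

    print(wait_until(popup_closed, timeout=2.0, interval=0.01))
```

A longer timeout makes the task more tolerant of slow pages but slows every run that has to wait it out. A shorter one is faster but may move on before the page has finished updating.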
3. Search with keyword(s)
If you are scraping a directory website, you will often need to search with one or more keywords to reach the information you need. Follow the instructions below to run a search before scraping the data.
Select the Search with keyword(s) option
Click Settings (1) to add a search box, click the search box on the webpage, and then click Confirm
Click the edit button(2) to add search keyword(s)
Enter one keyword per line and then Confirm
Depending on whether the page has a "search" button, choose either "Hit the Enter/Return key when finish entering" or "Click the search button when finishing entering". For the latter, make sure you've clicked "Setting" and selected the correct "Search" button.
Confirm to continue
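The "one keyword per line" input can be pictured as a simple loop over the lines, with one search per keyword. The sketch below assumes a site whose search form submits a plain GET query. The URL, the `q` parameter, and both function names are hypothetical, and Octoparse itself types into the search box rather than building URLs.

```python
from urllib.parse import urlencode


def parse_keywords(text: str) -> list[str]:
    """Split multi-line input into keywords, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_search_urls(base_url: str, param: str,
                      keywords: list[str]) -> list[str]:
    """Build one search URL per keyword, URL-encoding each value."""
    return [f"{base_url}?{urlencode({param: kw})}" for kw in keywords]


if __name__ == "__main__":
    raw_input_text = "plumber\n\n coffee shop \n"
    keywords = parse_keywords(raw_input_text)
    for url in build_search_urls("https://example.com/search", "q", keywords):
        print(url)
```

Each keyword produces its own search, so the results are scraped once per keyword, in the order the keywords were entered.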
TIPS: Learn more about how to deal with text/keyword input:
4. Switch tab
To scrape data from inside a tab, follow the instructions below.
Taking the screenshot above as an example, here's how to get the data under the "SPECS" tab.
Select the Switch tab option
Follow the guide on the Tips panel to click the tab to show the data
Confirm to continue
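For intuition, data under a tab often lives in its own panel element that is shown only after the tab is clicked. The sketch below pulls the text out of one such panel with Python's standard `html.parser`. It assumes the panel is already present in the HTML and has an `id` of `specs`; the sample markup and the `PanelText` class are made up for illustration. On sites that load tab content only after the click, a static parse like this would find nothing, which is why the click step matters.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PanelText(HTMLParser):
    """Collect the text found inside the element with a given id."""

    def __init__(self, panel_id: str) -> None:
        super().__init__()
        self.panel_id = panel_id
        self.depth = 0  # 0 means we are outside the target panel
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.panel_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


SAMPLE_HTML = """
<div id="overview"><p>Great laptop</p></div>
<div id="specs"><ul><li>RAM: 16 GB</li><li>Weight: 1.2 kg</li></ul></div>
"""

if __name__ == "__main__":
    parser = PanelText("specs")
    parser.feed(SAMPLE_HTML)
    print(parser.chunks)
```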