This tutorial is written for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so.
The Better Business Bureau (BBB) is a private organization that provides the public with information on businesses and charities and rates them from A+ to F based on specific criteria. It also handles consumer complaints about firms.
This tutorial will show you how to scrape basic business information, such as name, BBB rating, and location, from bbb.org.
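If you plan to post-process the scraped data outside Octoparse, the three fields mentioned above could be modeled roughly as follows. This is only a sketch: the `Business` class, its field names, and the `RATING_ORDER` list (including the plus/minus grades between A+ and F) are assumptions made for illustration and are not part of Octoparse or of the exported file.

```python
from dataclasses import dataclass
from typing import List

# Assumed ordering of BBB letter grades, best to worst (illustrative only).
RATING_ORDER = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "F"]


@dataclass
class Business:
    """One scraped row; the field names are hypothetical."""
    name: str
    rating: str
    location: str


def rating_rank(rating: str) -> int:
    """Return the grade's position in RATING_ORDER; unknown grades sort last."""
    try:
        return RATING_ORDER.index(rating.strip().upper())
    except ValueError:
        return len(RATING_ORDER)


def sort_by_rating(businesses: List[Business]) -> List[Business]:
    """Return the businesses best-rated first; ties keep their original order."""
    return sorted(businesses, key=lambda b: rating_rank(b.rating))
```

Sorting through a rank lookup instead of plain string comparison keeps "A+" ahead of "A", and any value that is not a recognized grade ends up at the bottom of the list.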
To follow along, you can use the URL below:
The main steps are shown in the menu on the right, and you can download the sample task file here.
1. Create a Go to Web Page action - to open the target website
Enter the page URL on the home screen and click Start to create a new task
2. Save the cookies - to save necessary pre-settings before extracting the data
Turn on Browser mode (in the upper right corner)
Close the cookies notification
Choose All Business
Click Go to Web Page in the workflow
Click Options and tick Use cookies
Click Use cookie from the current page, then click Apply
3. Auto-detect the webpage - to create a workflow
Turn off Browser mode
Choose Auto-detect web page data and wait for the detection to complete
Check the data fields in Data Preview and delete unwanted fields
Uncheck Add a page scroll and click Create workflow
4. Adjust the timeout - to make the scrape more stable
Select the Click to Paginate action in the workflow
Choose Options, tick Load with AJAX, and set the timeout to 5s
Click Apply
Click Extract data in the workflow
Set Wait before action to 3s
Click Apply
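For readers who think in code, the two timing settings in this step are loosely analogous to a bounded wait after pagination and a fixed pause before extraction. The sketch below is plain Python written for illustration; it is not how Octoparse is implemented, and `wait_for_content`, `extract_after_delay`, and their parameters are hypothetical names.

```python
import time
from typing import Callable


def wait_for_content(
    is_loaded: Callable[[], bool],
    timeout_s: float = 5.0,
    poll_interval_s: float = 0.1,
) -> bool:
    """Poll is_loaded() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_loaded():
            return True
        time.sleep(poll_interval_s)
    # One final check once the deadline has passed.
    return is_loaded()


def extract_after_delay(extract: Callable[[], list], delay_s: float = 3.0) -> list:
    """Pause for a fixed delay, then run the extraction callback."""
    time.sleep(delay_s)
    return extract()
```

The defaults mirror the 5s and 3s values chosen above. Longer values trade speed for stability: the page gets more time to finish loading, but every page of results takes longer to process.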
5. Run the task - to get your target data
Click Save on the upper right to save your task
Click Run next to it and wait for a Run Task window to pop up
Select Run on your device to run the task on your local device
Wait for the task to complete
Here is a sample output from a local run:
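If you export the results to a CSV file, a short script can summarize them. The sketch below counts how many businesses fall under each rating. The column name `Rating` and the file name in the usage comment are assumptions; adjust them to match the field names shown in your own Data Preview.

```python
import csv
from collections import Counter
from typing import IO


def count_by_rating(csv_file: IO[str], rating_column: str = "Rating") -> Counter:
    """Count rows per rating value in an exported CSV file."""
    reader = csv.DictReader(csv_file)
    counts: Counter = Counter()
    for row in reader:
        # Treat an absent or empty cell as a missing rating.
        rating = (row.get(rating_column) or "").strip()
        counts[rating or "(missing)"] += 1
    return counts


# Usage (hypothetical file name):
# with open("bbb_export.csv", newline="", encoding="utf-8-sig") as f:
#     for rating, n in count_by_rating(f).most_common():
#         print(f"{rating}: {n}")
```

Opening the file with `utf-8-sig` strips a byte-order mark if the export contains one, so the first column header is read cleanly.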