Scrape company info from Goodfirms.co

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend upgrading, as the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!

GoodFirms is a research and review platform that helps software buyers and service seekers choose the best software or firm. It also helps IT companies and software vendors boost user acquisition, market share, and brand awareness.

This tutorial shows, in four steps, how to scrape company info such as company name, location, and website from GoodFirms.

To follow through, you may want to use the URL below:

The main steps are shown in the menu on the right, and you can download the sample task file here.


1. Create a Go to Webpage step - to open the target website

  • Enter the page URL on the home screen and click Start to create a new task


2. Auto-detect the webpage - to create a workflow

  • Choose Auto-detect webpage data and wait for the detection to complete

  • Uncheck Add a page scroll

  • Click Create workflow

  • Check the data fields in Data Preview and delete unwanted fields or rename them if needed (double-click to rename)


3. Modify Pagination settings - to locate the pagination button accurately

  • Click on the Pagination box

  • Replace the auto-generated Matching XPath with: //a[@title="Next Page"]

  • Click Apply to save the change


NOTE: To learn more about XPath in Octoparse, please check: What is XPath and how to use it in Octoparse?
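
To see what the XPath above selects, here is a minimal sketch in Python using only the standard library. The HTML snippet is a hypothetical stand-in for a pagination bar, not markup copied from GoodFirms, so the real page may be structured differently. The point is only that the expression matches the one link whose title attribute is exactly "Next Page".

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination markup, made up for illustration;
# the real page's HTML may differ.
SAMPLE_HTML = """
<div class="pagination">
  <a href="?page=1" title="Previous Page">Prev</a>
  <a href="?page=2">2</a>
  <a href="?page=3" title="Next Page">Next</a>
</div>
"""

root = ET.fromstring(SAMPLE_HTML)

# ElementTree supports a subset of XPath. When searching from an element,
# ".//a[...]" plays the role of "//a[...]" in the tutorial's expression.
matches = root.findall('.//a[@title="Next Page"]')

# Collect the link targets of every matching element.
next_links = [a.get("href") for a in matches]
print(next_links)
```

Because the match is on the title attribute rather than on position or link text, the numbered page links and the "Previous Page" link are ignored in this sample.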

  • Click on the Click to Paginate box in the workflow

  • Select Options

  • Tick Load with AJAX and set the AJAX timeout (7-10 seconds recommended)


Tip: To learn why an AJAX timeout is needed, see: Handling AJAX
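
As a general illustration of the idea behind a timeout (not a description of how Octoparse works internally): when a click loads content with AJAX, the page does not fully reload, so a tool has to decide how long to wait for new content before moving on. The sketch below shows one common pattern, a bounded wait. The helper name wait_for and the simulated delays are assumptions made up for this example.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], timeout_s: float,
             poll_s: float = 0.05) -> bool:
    """Poll `condition` until it is true or `timeout_s` seconds elapse.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll_s)
    # One last check once the deadline has passed.
    return condition()


# Simulate content that "arrives" 0.3 seconds after the click.
start = time.monotonic()
loaded_in_time = wait_for(lambda: time.monotonic() - start >= 0.3,
                          timeout_s=1.0)

# Simulate content that never arrives: the wait gives up after the timeout.
timed_out = wait_for(lambda: False, timeout_s=0.2)

print(loaded_in_time, timed_out)
```

A timeout that is too short risks moving on before the next page of results has appeared, while one that is too long slows every page turn, which is why a moderate value is suggested above.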


4. Run the task - to get your desired data

  • Click Save on the upper right side to save your task

  • Click Run next to it and wait for the Run Task window to pop up

  • Select Run on your device to run the task on your local device

  • Wait for the task to complete

Here is a sample output from a local run:
