Help Center

Scrape jobs from LinkedIn

Updated over 7 months ago

LinkedIn is a good database for finding valuable job information. In this tutorial, we are going to show you how to scrape job information from LinkedIn.com

To follow through, you may want to use the URL in the tutorial:

https://www.linkedin.com/jobs/search/?currentJobId=2011756127&geoId=105080838&keywords=accountant&location=New%20York%2C%20United%20States

We will scrape data such as job titles, companies, levels, types, functions, and industries in Octoparse.

The website applies an infinite scroll coupled with a "Show More" to load more reviews. After we scroll the page to the bottom like 6 times, a "show more" button would reveal, and if we want to continue to load jobs, we have to click on the button.

The main steps are shown in the menu on the right.

[Download the demo task click here]

1. "Go To Web Page" - to open the targeted web page

Enter the URL on the home page and click Start

2. Login to your LinkedIn account

Since Linkedin requires you to log in first before accessing the jobs, we need to log in and save the cookies.

Toggle on the Browse Mode

Manually log in to your LinkedIn account in Octoparse's built-in browser
Turn off Browse Mode
Go to Options
Tick Use cookie
Click on Use cookie from the current page

Click Apply to save

3. Set up scroll settings - to scroll down the page

Since the web page requires scrolling down to load more jobs, you need to set up scroll settings for the Go to Web Page action.

Tick Scroll down the page after it is loaded
Select Partial as Scroll Area
Input the XPath: //*[@id="main"]/div/div[2]/div[1]/div
Select Scroll for one screen
Set up scroll 30 times
Click Apply to save

4. Auto-detect web page - to create a workflow

You can use the auto-detect web page to scrape the list of jobs.

Choose Auto-detect web page data

Wait for the detection to complete
Uncheck Add a page scroll from the Tips panel
Click Create workflow

5. Click on each link - to get more detailed information

If you want to scrape job details from each job post, you need to click on each job URL to load the details page.

Click on the first job title
Select Click element

Set up the AJAX timeout as 10s

6. Extract data - to select the data for extraction

Click on any text information you want to extract from the page
Select Text or other data field you need on the Tips panel
Repeat the steps until you get all the data needed to be scraped

Edit the name of the data fields if needed

Uncheck the Extract data in the loop

Set up wait time as 7s

7. Set up a scroll for Click to Paginate

Click on Click to Paginate step

8. Run your task - to get the data you want

Click Save, and click Run on the upper right side
Select Run on your device to run the task on your computer

TIP: Please don't run the task in the Cloud since LinkedIn will fail to login if it detects suspicious IPs.

Here is the sample output.

Related Articles

Scrape articles from Medium

Scrape posts from LinkedIn

Scrape job info from Glassdoor

Scrape product information from Bukalapak

Scrape hotel info from Expedia