Many product web pages use image carousels (like the one below) to display multiple images as slides that you can usually flip through manually. In this tutorial, I will show you how to extract the images of a carousel into the layout you need.
You may need this link to follow along:
1. Scrape one image into one column
Select one of the images, then select Image URL on the Tips panel. Repeat the same process for each of the other images to fetch their URLs.
Note: In this example page, we need to select the IMG tag from the bottom of the Tips panel to locate the image URL. Octoparse shows the Image URL option on the Tips panel only when the IMG tag is selected.
2. Scrape images into different lines
It is also possible to scrape images into different lines of the same column using a loop extract action.
1) Select the first image -> Select the IMG tag
2) Choose Select all similar elements
3) Select Image URL
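For readers who want to see what this loop does outside Octoparse, here is a minimal Python sketch. It is not Octoparse's implementation. The sample HTML, the example.com URLs, and the names ImageURLCollector and extract_image_urls are all made up for illustration. It collects the src of every IMG tag and prints one URL per line, which mirrors the result of steps 1) to 3).

```python
from html.parser import HTMLParser


class ImageURLCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.urls.append(src)


# Hypothetical carousel markup, standing in for the real product page.
SAMPLE_HTML = """
<ul class="carousel">
  <li><img src="https://example.com/img/thumb_1.jpg" alt="Front"></li>
  <li><img src="https://example.com/img/thumb_2.jpg" alt="Side"></li>
  <li><img src="https://example.com/img/thumb_3.jpg" alt="Back"></li>
</ul>
"""


def extract_image_urls(html):
    """Return the image URLs in document order, one list item per image."""
    collector = ImageURLCollector()
    collector.feed(html)
    return collector.urls


# One URL per line, like the rows produced by the loop extract action.
for url in extract_image_urls(SAMPLE_HTML):
    print(url)
```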
3. Scrape all images into one column
There are two ways to scrape all image URLs into a single line of one column.
Option 1. Merge the extracted image URLs into one line
Once you've loop extracted the image URLs into different lines (following the steps in Scrape images into different lines), you can merge the extracted lines into one single line.
1) Click the More icon for the data field, then select Merge field data
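Conceptually, merging field data joins the separate rows into one string. The Python sketch below illustrates the idea only. The function name merge_field_data, the sample URLs, and the space separator are assumptions, and the separator Octoparse actually uses may differ.

```python
def merge_field_data(rows, separator=" "):
    """Join the non-empty rows of a field into one line.

    The separator is an assumption made for this sketch.
    """
    cleaned = (row.strip() for row in rows)
    return separator.join(row for row in cleaned if row)


# Hypothetical rows, as they would look after the loop extract action.
rows = [
    "https://example.com/img/thumb_1.jpg",
    "https://example.com/img/thumb_2.jpg",
    "https://example.com/img/thumb_3.jpg",
]

print(merge_field_data(rows))
```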
Option 2. Scrape the HTML code of the carousel and extract the image URLs from the code with a regular expression
1) Select the entire carousel and select OuterHtml
2) Click the More icon for the field and select Clean data
3) Click Add Step and choose Matching with Regular Expression
4) Inspect the code to find the starting value and ending value of the image URL
5) Click Try the RegEx tool
6) Enter the Start with and End with values to generate a RegEx and apply the settings
7) Tick Match all and Confirm
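To make the Start with / End with idea concrete, here is a small Python sketch of the same technique: build a regular expression that captures whatever sits between a starting value and an ending value, then return either every match or only the first. The exact expression the Octoparse RegEx tool generates is not shown in this tutorial, so the pattern below is an assumption, as are the function name extract_between and the sample HTML.

```python
import re


def extract_between(html, start_with, end_with, match_all=True):
    """Return the text found between start_with and end_with.

    With match_all=True every occurrence is returned (similar in spirit to
    ticking Match all); otherwise only the first occurrence is returned.
    """
    # Escape both values so they are matched literally, and capture the
    # shortest possible text between them.
    pattern = re.compile(
        re.escape(start_with) + r"(.*?)" + re.escape(end_with), re.DOTALL
    )
    if match_all:
        return pattern.findall(html)
    first = pattern.search(html)
    return [first.group(1)] if first else []


# Hypothetical OuterHtml of a carousel, standing in for the real page.
SAMPLE_HTML = """
<ul class="carousel">
  <li><img src="https://example.com/img/thumb_1.jpg" alt="Front"></li>
  <li><img src="https://example.com/img/thumb_2.jpg" alt="Side"></li>
</ul>
"""

# Starting value: src="   Ending value: the closing double quote.
for url in extract_between(SAMPLE_HTML, 'src="', '"'):
    print(url)
```

Choosing a starting value that appears only in front of the image URL (here src=") keeps other attributes, such as alt, out of the results.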
Note: The scraped image URLs are thumbnail URLs. If you need the full-size image URLs, you can add further steps to reformat the field. Please check this tutorial: