Sometimes we need to scrape image URLs from a website, but all we get is the URL of a thumbnail instead of the normal-sized picture.
Here is a picture scraped from Amazon. As you can see, the image is too small to show any detail.
If you would like to know how to scrape the image URLs, you can refer to this tutorial first: Scrape images from a carousel
To get the normal-sized images, all we need to do is modify the image URL that we already have, following the steps below:
1. Observe the difference between the full image URL and the thumbnail URL
URLs for different sizes of the same image usually differ only slightly, so the first step is to find what distinguishes the full image URL from the thumbnail URL.
For example, the thumbnail URL on Amazon looks like this:
The full image URL is:
The thumbnail URL contains 'SR38,50', which the full image URL does not. Deleting it from the thumbnail URL gives the full image URL.
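If you want to try the same deletion outside Octoparse, here is a minimal Python sketch. It assumes the size marker always has the form "SR" followed by a width, a comma, and a height, as in 'SR38,50'. The URL is a hypothetical placeholder, not a real Amazon address, and the function name is made up for illustration.

```python
import re

# Matches a size token such as "SR38,50".
# Assumed form: "SR" + width + "," + height; other sites may differ.
SIZE_TOKEN = re.compile(r"SR\d+,\d+")


def strip_size_token(url: str) -> str:
    """Return the URL with any SR<width>,<height> token removed."""
    return SIZE_TOKEN.sub("", url)


# Hypothetical thumbnail URL, for illustration only.
thumbnail = "https://images.example.com/I/abc123._SR38,50_.jpg"
full_image = strip_size_token(thumbnail)
print(full_image)
```

A pattern is used instead of the literal text so that thumbnails with other dimensions are handled too. If you only ever see one fixed value, a plain text replacement works just as well.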
In some cases, the image URL contains a size value such as 85X85 to indicate the size of the image:
You can try replacing "85X85" with "1000X1000" and check whether the URL is still valid:
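The swap-and-check idea can be sketched in Python as follows. The URL and both function names are hypothetical. The validity check sends a HEAD request, which some servers reject even when the image exists, so treat a negative result as a hint, not proof.

```python
import urllib.request


def swap_size(url: str, old: str = "85X85", new: str = "1000X1000") -> str:
    """Return the URL with the old size value replaced by the new one."""
    return url.replace(old, new)


def url_is_valid(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to the URL succeeds with a 2xx status."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # OSError covers HTTP errors, connection failures, and timeouts;
        # ValueError covers strings that are not URLs at all.
        return False


# Hypothetical thumbnail URL, for illustration only.
thumbnail = "https://images.example.com/products/85X85/item.jpg"
candidate = swap_size(thumbnail)
print(candidate)
```

Calling url_is_valid(candidate) performs a real network request, so run it only against URLs you actually scraped. If the larger size is not available, try other common sizes in the same way.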
2. Use the Octoparse Clean Data function to reformat the thumbnail URL into a full URL
Add a Replace step
Type the value you want to replace (SR38,50, for example) into the Replace box
Type the value you want to replace it with into the With box
(In the case of the Amazon image URL, you need to delete SR38,50, which means replacing it with nothing. So you just need to leave the With box empty.)
Click Confirm to save
Click Apply to save the settings
Then you can get the full image URL you need in the final results.
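If the data has already been exported with thumbnail URLs, the same replacement can be applied afterwards. The sketch below is only an illustration: the CSV layout, the column name image_url, the sample rows, and the function name are all assumptions, not Octoparse output.

```python
import csv
import io


def clean_image_urls(rows, column, old, new=""):
    """Yield copies of the rows with `old` replaced by `new` in one column."""
    for row in rows:
        cleaned = dict(row)
        cleaned[column] = cleaned[column].replace(old, new)
        yield cleaned


# Hypothetical export; the URLs are quoted because they contain a comma.
exported = io.StringIO(
    "title,image_url\n"
    'Item A,"https://images.example.com/I/a1._SR38,50_.jpg"\n'
    'Item B,"https://images.example.com/I/b2._SR38,50_.jpg"\n'
)

# Leaving `new` empty mirrors leaving the With box empty.
rows = list(clean_image_urls(csv.DictReader(exported), "image_url", "SR38,50"))
for row in rows:
    print(row["title"], row["image_url"])
```

For a real file, replace the in-memory text with open("your_export.csv", newline="", encoding="utf-8") and adjust the column name to match your export.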