You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust.
The Merge field data function combines data extracted across multiple rows into a single row.
For example, when extracting an article from a blog, you might not be able to select the entire article at once because it is split into multiple paragraphs. Each paragraph is then extracted into its own row, but you still want all the paragraphs in a single row.
How?
Step 1. Select the desired data to extract
Click on the first paragraph of the article and choose Select all similar elements in the Tips panel. A Loop Item will be created to extract every paragraph of the post.
Step 2. Merge the extracted data
Click on the Extract Data step, go to the Data Preview panel, and turn on Merge field data for the field.
You are all set! Let's run the task and see what the actual exported data looks like. You can see that paragraphs captured in Field 1 are now merged into a single row as one big chunk.
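To make the effect of the merge concrete, here is a minimal Python sketch of the idea, not Octoparse's actual implementation. The function name `merge_field` and the newline separator are assumptions made for illustration; this guide does not state which separator Octoparse uses when it joins the values.

```python
def merge_field(rows, field, separator="\n"):
    """Collapse many rows into one by joining the values of one field.

    Hypothetical helper: the separator is an assumption, not Octoparse's
    documented behavior.
    """
    # Skip rows where the field is missing or empty.
    values = [row[field] for row in rows if row.get(field)]
    return [{field: separator.join(values)}]


# One row per paragraph, as produced by the Loop Item before merging.
rows = [
    {"Field1": "First paragraph."},
    {"Field1": "Second paragraph."},
    {"Field1": "Third paragraph."},
]

merged = merge_field(rows, "Field1")
print(len(merged))          # number of rows after merging
print(merged[0]["Field1"])  # all paragraphs in one cell
```

Before merging there are three rows, one per paragraph. After merging there is a single row whose Field1 cell holds all three paragraphs.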
Note:
Merge field data is particularly useful for extracting articles from websites. It lets you capture the whole article as one unit, without extraneous elements such as blank lines, comments, or images.
Once the data is merged into a single cell, you can use the Data reformat tools to add a prefix or suffix, such as "|" or "\".
If there are multiple fields to extract, you need to set up "Merge field data" for every field.
This feature can also be used to merge two fields. Add two Extract Data actions to the workflow with one field in each, give both fields the same name, and set "Merge field data" for both. As a result, the data scraped in the two fields will be merged into one cell.
This feature cannot be previewed. The data will only be merged when the task runs.
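The note about merging two fields can be pictured with a short Python sketch. It only illustrates the outcome and is not how Octoparse works internally. The helper `merge_same_named`, the field name "Content", and the newline separator are all assumptions made for this example.

```python
def merge_same_named(records, separator="\n"):
    """Join the values of fields that share a name into one cell per name.

    Hypothetical helper: the separator is an assumption.
    """
    collected = {}
    for record in records:
        for name, value in record.items():
            # Group values by field name, preserving extraction order.
            collected.setdefault(name, []).append(value)
    return {name: separator.join(values) for name, values in collected.items()}


# Each dict stands for the output of one Extract Data action.
first_action = {"Content": "Article title"}
second_action = {"Content": "Article body"}

cell = merge_same_named([first_action, second_action])
print(cell["Content"])
```

Because both actions use the same field name, their values end up in one cell instead of two separate fields.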