How can I keep the duplicates in the Cloud runs?

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!

When you run a task multiple times, Octoparse may report duplicates on the Dashboard.

This is because Octoparse stores the data scraped from all runs of a task together and checks it for duplicates. Duplicates are deleted automatically from the Cloud.

Duplicates are data rows that are identical in every column. To keep all the data rows from each run, add the Current date & time as a field in the task.

Go to the Data Preview, click the Add Custom Field button, and choose Current date & time.


The new field then appears in the Data Preview.

The field records the date and time at which each data row was scraped. Because every row is scraped at a different time, no two rows are identical in all columns, so none are treated as duplicates.
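
To illustrate why the extra field prevents rows from being dropped, here is a minimal Python sketch of all-column deduplication. It is not Octoparse's actual implementation. The `deduplicate` and `with_timestamp` helpers, the `Current_Time` field name, and the sample data are all hypothetical, and for simplicity the sketch stamps each run with one fixed time rather than a per-row time.

```python
from datetime import datetime, timedelta


def deduplicate(rows):
    """Keep the first occurrence of each row.

    Two rows count as duplicates only if every column matches.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable snapshot of all columns
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def with_timestamp(rows, scraped_at):
    """Return copies of the rows with a hypothetical Current_Time column added."""
    return [{**row, "Current_Time": scraped_at.isoformat()} for row in rows]


# Hypothetical data: the same page scraped in two separate runs.
scraped = [
    {"title": "Widget", "price": "9.99"},
    {"title": "Gadget", "price": "19.99"},
]

# Without a timestamp, the second run repeats the first and is dropped.
without_ts = deduplicate(scraped + scraped)

# With a timestamp, rows from different runs differ in one column.
run_1_time = datetime(2024, 1, 1, 9, 0, 0)
run_2_time = run_1_time + timedelta(days=1)
with_ts = deduplicate(
    with_timestamp(scraped, run_1_time) + with_timestamp(scraped, run_2_time)
)

print(len(without_ts))  # 2
print(len(with_ts))     # 4
```

In this sketch the two runs collapse to two rows without the timestamp, while all four rows survive once the time column differs between runs.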
