The Blendr.io Data Store is a database that can store any type of item, e.g. Contacts, Products from a webshop, or Orders. It can be used as permanent data storage or as a temporary data cache (*).
A Data Store is especially useful if you want to process items, and you want to keep track of the status for each item.
There are other ways of processing items "exactly once" in Blendr.io. Some connectors have blocks that retrieve data incrementally by using an internal pointer. The Data Store, however, is more advanced: it keeps a status for each individual item and allows individual items to be retried.
(*) Note on data caching: it is also possible to configure caching on individual blocks in a Data Blend. The Data Store, however, gives you more control, since you choose which items are added to it.
Creating a Data Store
In Blendr.io, go to the menu in the top right corner and select "My Data Stores". Now you can add a Data Store, change the settings of an existing Data Store, and see the items inside a Data Store.
Click the database icon to view items in a Data Store.
Click the pencil icon to change the settings of a Data Store.
Once you have created a Data Store, you can add items to it in a Data Blend, using the Data Store block "Add item".
Status of items
Each item in a Data Store has a status:
- Ready to process: a new item, not yet processed by a Data Blend
- Processed: an item that has been processed by a Data Blend
- Error: an item for which processing was attempted but failed. It will be retried based on the settings for a Data Queue (see below).
- Failed: an item that was in status Error and whose processing has permanently failed, based on the settings for a Data Queue (see below).
The Data Store is a great way to avoid duplicates. Each item in the Data Store has a unique key. The unique key can be anything, but typically it is the id of the item, e.g. an order id when you are processing orders from a web shop.
For each Data Store, you can configure how the Data Store handles adding new items with the same unique key. See setting "Action to perform on duplicate". You can choose between 3 settings:
- Update: when you add an item with a unique key that already exists in the Data Store, the existing item will be updated, but its status will NOT change. So if the existing item has status "Processed", it will NOT be processed again after the update.
- Replace: when you add an item with a unique key that already exists in the Data Store, the existing item will be replaced with this new item, and the new item will have a new status "Ready to process".
- Do nothing: when you add an item with a unique key that already exists in the Data Store, the new item will be ignored.
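The three settings can be illustrated with a small in-memory model. This is only a sketch, not the Blendr.io implementation: the class `ToyDataStore`, its `add_item` method and the option names `"update"`, `"replace"` and `"do_nothing"` are hypothetical, and the sketch assumes that "updated" means the stored data is overwritten.

```python
from dataclasses import dataclass, field

READY = "Ready to process"
PROCESSED = "Processed"


@dataclass
class ToyDataStore:
    """Hypothetical in-memory model of a Data Store (illustration only)."""
    on_duplicate: str = "do_nothing"  # "update", "replace" or "do_nothing"
    items: dict = field(default_factory=dict)  # unique key -> {"data", "status"}

    def add_item(self, unique_key, data):
        existing = self.items.get(unique_key)
        if existing is None:
            # New key: store the item with the initial status.
            self.items[unique_key] = {"data": data, "status": READY}
        elif self.on_duplicate == "update":
            # Assumption: "update" overwrites the data and keeps the status.
            existing["data"] = data
        elif self.on_duplicate == "replace":
            # The new item takes the place of the old one and is ready again.
            self.items[unique_key] = {"data": data, "status": READY}
        # "do_nothing": the new item is ignored.


def demo(action):
    """Add the same key twice, marking the item as processed in between."""
    store = ToyDataStore(on_duplicate=action)
    store.add_item("order-1", {"total": 10})
    store.items["order-1"]["status"] = PROCESSED
    store.add_item("order-1", {"total": 12})
    return store.items["order-1"]


for action in ("update", "replace", "do_nothing"):
    print(action, demo(action))
```

With "update" the item keeps status "Processed" but holds the new data, with "replace" it holds the new data and is "Ready to process" again, and with "do_nothing" the original item is left untouched.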
Example scenario: process orders exactly once
A typical scenario is a Data Blend that reads items from a data source and writes them to the Data Store. For example, you may read orders from your webshop every day and store them in a Data Store for further processing.
The Data Store will automatically avoid creating duplicates, so in the Data Blend you could simply retrieve ALL orders every time and add them to the Data Store over and over. There's no need to read them incrementally. In the Data Store, every order will only exist once.
For this scenario, make sure to set "Action to perform on duplicate" to "Do nothing".
A second Data Blend can read orders from the Data Store, process them and set the status of each order to "Processed".
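The two-blend scenario can be sketched in a few lines of Python. This is an illustration under assumptions, not Blendr.io code: a plain dictionary stands in for the Data Store, and the functions `import_orders` and `process_orders` are hypothetical stand-ins for the two Data Blends.

```python
READY = "Ready to process"
PROCESSED = "Processed"


def import_orders(store, orders):
    """First Data Blend: add every order; existing keys are ignored ("Do nothing")."""
    for order in orders:
        store.setdefault(order["id"], {"data": order, "status": READY})


def process_orders(store, handled):
    """Second Data Blend: handle each ready order, then mark it as processed."""
    for key, item in store.items():
        if item["status"] == READY:
            handled.append(key)
            item["status"] = PROCESSED


store, handled = {}, []
webshop_orders = [{"id": 1001}, {"id": 1002}]

# First day: both orders are imported and processed.
import_orders(store, webshop_orders)
process_orders(store, handled)

# Second day: ALL orders are sent again, including one new order.
webshop_orders.append({"id": 1003})
import_orders(store, webshop_orders)
process_orders(store, handled)

print(handled)  # [1001, 1002, 1003]
```

Although orders 1001 and 1002 are added twice, each order is handled only once, because the unique key prevents a second copy and the status prevents a second processing run.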
Add items to a Data Store
Use the Data Store block "Add item" to add an item to a Data Store. Here's an example that retrieves the attendees of an event from Eventbrite and adds them one by one to a Data Store called "Event attendees". Note that the whole attendee object is sent to the Data Store (input "Data"), and that the attendee id from Eventbrite is used as the unique key to avoid duplicates (input "Uniquekey").
Viewing items in the Data Store
The above Data Blend could be scheduled to run every hour, and it will not create duplicates thanks to the unique key. You can view the items in the Data Store by going to "My Data Stores", selecting the Data Store "Event attendees", and clicking the database icon:
You can click on one item to see the item in full:
Processing items from a Data Store
You can create a Data Blend to process items from a Data Store. Use the block "List items to process" to get items from a Data Store with status "Ready to process". In the following example, we are processing items and adding them to a Google Sheet:
Note that when this Data Blend has executed, all items that were processed will automatically be set to status "Processed", unless one of the blocks in the loop of the block "List items to process" returned an error. If an error occurs in the loop, the current item will be updated to status "Error" in the Data Store.
On the next run of the Data Blend, the block "List items to process" will output all items in the Data Store with status "Ready to process" AND status "Error". This allows you to retry processing items. Note: you can configure this behavior using the Data Queue settings, see below.
It is also possible to update the status of items explicitly, by using the block "Update item status".
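The automatic status handling described above can be sketched as follows. This is a simplified model under assumptions, not the actual behavior of the blocks: `run_blend` is a hypothetical stand-in for a Data Blend that loops over "List items to process", and `add_to_sheet` is a hypothetical stand-in for the block that writes a row to a Google Sheet.

```python
READY, PROCESSED, ERROR = "Ready to process", "Processed", "Error"


def run_blend(store, process):
    """One run: offer items in status READY or ERROR and record the outcome."""
    for key, item in store.items():
        if item["status"] not in (READY, ERROR):
            continue  # already processed, not offered again
        try:
            process(item["data"])
            item["status"] = PROCESSED
        except Exception:
            # A failing block in the loop puts only the current item in Error.
            item["status"] = ERROR


def add_to_sheet(row):
    """Hypothetical processing step that fails for incomplete rows."""
    if not row["email"]:
        raise ValueError("missing email")


store = {
    "a1": {"data": {"email": "ann@example.com"}, "status": READY},
    "a2": {"data": {"email": None}, "status": READY},
}

run_blend(store, add_to_sheet)
statuses_after_first_run = {key: item["status"] for key, item in store.items()}

# The cause of the error is fixed, so the next run can retry the item.
store["a2"]["data"]["email"] = "bob@example.com"
run_blend(store, add_to_sheet)
statuses_after_second_run = {key: item["status"] for key, item in store.items()}

print(statuses_after_first_run)
print(statuses_after_second_run)
```

After the first run, item "a1" is "Processed" and item "a2" is in "Error". The second run offers only "a2" again, and it ends up as "Processed" once the missing data is in place.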
The Data Store can also be used as a Data Queue. The behavior is exactly the same, except that you can configure three additional things: how many times processing is retried for an individual item after the first attempt failed, for how long it is retried, and how long items are kept in the Data Store before being deleted (data retention).
In the Data Store settings, set the type to "Data Queue". You will now have extra settings:
- Process expiry days: the number of days items are kept in status 'Error' before being transitioned to status 'Failed'.
- Process retention days: the number of days items are stored in the Data Store before being deleted.
- Max retries: the number of retries before an item is transitioned to status 'Failed'.
- Amount of seconds before retrying: the number of seconds to wait before reprocessing an item in status 'Error'. For example, if you have a Data Blend with the Data Store block "List items to process" that runs every hour, and this value is set to 86400, then each item in Error will only be offered to the Data Blend for reprocessing once a day.
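The interplay of these settings can be sketched as a small decision function. How Blendr.io combines the settings internally is not documented here, so everything below is an assumption: the names `QueueSettings` and `next_action` are hypothetical, and the order of the checks is just one plausible reading. The simulation reproduces the example of an hourly Data Blend with a retry delay of 86400 seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class QueueSettings:
    """Hypothetical container for the Data Queue settings."""
    process_expiry_days: int
    max_retries: int
    retry_delay_seconds: int


def next_action(settings, retries, first_error_at, last_attempt_at, now):
    """Return "fail", "wait" or "retry" for an item in status Error."""
    if retries >= settings.max_retries:
        return "fail"  # retry budget used up
    if now - first_error_at >= timedelta(days=settings.process_expiry_days):
        return "fail"  # item has been in Error for too long
    if now - last_attempt_at < timedelta(seconds=settings.retry_delay_seconds):
        return "wait"  # too soon after the previous attempt
    return "retry"


settings = QueueSettings(process_expiry_days=7, max_retries=5,
                         retry_delay_seconds=86400)
t0 = datetime(2024, 1, 1, 9, 0)  # moment the item first went into Error

# The Data Blend runs every hour for three days; every retry fails again.
retries, last_attempt, retry_hours = 0, t0, []
for hour in range(1, 73):
    now = t0 + timedelta(hours=hour)
    if next_action(settings, retries, t0, last_attempt, now) == "retry":
        retry_hours.append(hour)
        retries += 1
        last_attempt = now

print(retry_hours)  # [24, 48, 72]
```

Even though the Data Blend runs 72 times, the item is offered for reprocessing only three times, once every 24 hours.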