Stream DB V3 allows you to establish a connection with a service that supports the JDBC (Java Database Connectivity) protocol and to execute SQL (Structured Query Language) instructions.

Unlike the DB component, Stream DB has been designed for batch execution, which means each return (resulting line or row) of the executed SQL instruction is individually treated through a subpipeline and can have its own processing flow. Learn more about subpipelines by clicking here.

Take a look at the configuration parameters of the component:

  • Account: for the component to authenticate to a JDBC service, it's necessary to use a BASIC or KERBEROS-type account (check the topic "Authentication via Kerberos").
  • Database URL: URL (Uniform Resource Locator) used to establish the connection to the database server that supports the JDBC protocol. This parameter supports Double Braces (see the configuration sketch after this list).
  • SQL Statement: SQL instruction to be executed.
  • Column Name: if there's an error in the subpipeline processing, the value associated with the column defined in this field will be added to the error message in the "processedId" field, as shown below:
{
  "timestamp": 1600797662733,
  "error": "Error message",
  "code": 500,
  "processedId": "2"
}

  • Parallel Execution Of Each Iteration: when activated, this option causes each iteration through the subpipeline to be executed in parallel, reducing the total execution time. However, there's no guarantee that the items will be executed in the order returned by the database.
  • Blob As File: if activated, this option causes blob-type fields to be stored in the pipeline context as files; otherwise, these fields are stored as normal texts (strings) encoded in base64, as follows:
// "Blob As File" true
{
"id": 12,
"blob": "d3X8YK.file",
}

// "Blob As File" false
{
"id": 12,
"blob": "iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAIAAACQkWg2AAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAAeSURBVDhPY1Da6EMSYiBJNVDxqAZiQmw0lAZHKAEAaskfEED3lr0AAAAASUVORK5CYII="
}

  • Fail On Error: if activated, the pipeline execution is suspended when an error occurs; otherwise, the execution continues, but the result will show a false value for the "success" property.
  • Custom Connection Properties: specific connection properties defined by the user.
  • Keep Connections: if activated, the option will keep the connection with the database for a maximum of 30 minutes; otherwise, the connection will be kept for 5 minutes only.
  • Advanced: advanced configurations.
  • Output Column From Label: for some databases, it's important to keep this option activated if your SELECT uses an alias, because that way you guarantee the column name will be shown exactly like the configured alias.
  • Connection Test Query: SQL instruction to be executed before each connection is established (e.g. SELECT 1 FROM DUAL). This parameter is optional and should be applied only to databases that don't provide reliable information about the connection status.
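
As an illustration only, the sketch below combines some of the parameters above for a hypothetical MySQL database; the host variable, database, table, and column names are assumptions, not values suggested by the component:

// hypothetical configuration sketch; names are illustrative
{
  "databaseUrl": "jdbc:mysql://{{ message.dbHost }}:3306/sales",
  "sqlStatement": "SELECT id, name, status FROM customers WHERE status = 'ACTIVE'",
  "columnName": "id",
  "connectionTestQuery": "SELECT 1"
}

With such a configuration, each row returned by the SELECT triggers the subpipeline once, and a failing row would be reported with its id value in the "processedId" field of the error message.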

Technology

Authentication via Kerberos

To authenticate to a database via Kerberos, it's necessary to:

  • inform a KERBEROS-type account
  • set a main Kerberos principal
  • set a keytab (that must be the base64 of the generated keytab file itself), as sketched after this list
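
A minimal sketch of such an account, with hypothetical field names and an illustrative principal; the actual fields depend on how KERBEROS accounts are registered in your platform:

// hypothetical KERBEROS account sketch; field names are assumptions
{
  "type": "KERBEROS",
  "principal": "svc-pipeline@EXAMPLE.COM",
  "keytab": "<base64 of the generated keytab file>"
}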

Messages flow

Message structure available in the onProcess subpipeline

Once the SQL instruction is executed, the subpipeline will be triggered, receiving the execution result through a message with the following structure:

{
  "column1": "data1",
  "column2": "data2",
  "column3": "data3"
}
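
For instance, assuming a hypothetical customers table, a statement such as SELECT id, name FROM customers that returns two rows triggers the onProcess subpipeline twice, once per row:

// first row
{
  "id": 1,
  "name": "Alice"
}

// second row
{
  "id": 2,
  "name": "Bob"
}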

Output with error

{
  "code": error_code,
  "error": "error message",
  "processedId": the_id_column_value
}

Output

After the component execution, a message is returned in the following structure:

{
  "total": 0,
  "success": 0,
  "failed": 0
}

  • total: total number of processed lines
  • success: total number of successfully processed lines
  • failed: total number of lines whose processing failed

IMPORTANT: to determine whether a line has been correctly processed, each onProcess subpipeline must respond with { "success": true } for each processed element.
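
For example, if a query returns 3 rows and only two of the onProcess executions respond with { "success": true }, the component output will be:

{
  "total": 3,
  "success": 2,
  "failed": 1
}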

Stream DB V3 performs batch processing. To better understand the concept, click here.
