Flows in Tx provide a structured way to automate and manage the execution of a data transformation pipeline. With flows, users can create, run, and validate transformation pipelines in dependency order, which keeps execution predictable in complex projects.
What Are Flows?
A flow is a collection of Tx Objects and Sources executed in a defined order. The order is derived from the dependencies between the objects, which form a directed acyclic graph (DAG). Flows group and run multiple objects together with their dependencies, which makes it easy to validate their structure and helps ensure accurate processing. Flows can be tailored to project needs by including or excluding specific objects using flexible expressions.
Why Use Flows?
Flows simplify large-scale operations by:
Running multiple Tx Objects in sequence while respecting dependencies.
Validating object structure before execution.
Providing real-time status updates for each object.
Offering the flexibility to include/exclude objects dynamically using Boolean logic and attribute-value pairs.
Creating and Managing Flows
Flows can be created and maintained from the Flows section of the left-hand project menu.
Creating a Flow
Go to the Flows section in the Tx interface.
Click Add a Flow to create a new flow. This action is available from the plus icon (1), from the context menu of a flow (2), or, if no flow exists yet, from the button on the canvas (3). The new flow appears in the list.
The flow will contain all elements by default. Remove or add Tx objects to the flow using the include/exclude syntax explained below.
Defining Tx Objects in a Flow
Tx objects for a flow can be defined using an expression-based system. The expression consists of attribute-value pairs that allow precise control over which objects are included or excluded in the flow.
Syntax Overview
Flows can be constructed using a simple grammar that consists of expressions that filter/select Tx Objects and Sources from the project.
Flows grammar consists of expressions, which, in turn, consist of key-value pairs. Multiple expressions can be combined using Boolean logic.
Expressions are enclosed in curly brackets. Optional parameters are enclosed in square brackets.
Grammar:
[Predecessor]{Expression}[Successor] [Operator [Predecessor]{Expression}[Successor] ]
Syntax:
[+]{attribute:value [attribute:value]}[+] [ AND/OR/NOT [+]{attribute:value [attribute:value]}[+] ]
Case-sensitivity:
Flows support both case-sensitive and case-insensitive attribute values. The type of quotes around the value selects the mode.
Case sensitive: use single quotes for case-sensitive attribute values. Example:
{templateName: 'Dimension'}
Case insensitive: use double quotes to match attribute values regardless of case. Example:
{templateName: "Dimension"}
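The quoting rule can be pictured as two comparison modes. The Python sketch below is illustrative only: value_matches is a hypothetical helper, not part of Tx, and it assumes that double-quoted values are compared after case folding.

```python
def value_matches(quoted_value: str, candidate: str) -> bool:
    """Compare a quoted flow value with an object's attribute value.

    Single quotes -> exact, case-sensitive comparison.
    Double quotes -> case-insensitive comparison (assumed: case folding).
    """
    if len(quoted_value) < 2 or quoted_value[0] != quoted_value[-1]:
        raise ValueError(f"value must be wrapped in matching quotes: {quoted_value!r}")
    quote, inner = quoted_value[0], quoted_value[1:-1]
    if quote == "'":
        return inner == candidate
    if quote == '"':
        return inner.casefold() == candidate.casefold()
    raise ValueError(f"unsupported quote character: {quote!r}")


# 'Dimension' only matches the exact spelling; "Dimension" matches any casing.
exact = value_matches("'Dimension'", "DIMENSION")
relaxed = value_matches('"Dimension"', "DIMENSION")
```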
Example:
The following expression defines a Tx Flow that includes "SRC_CUSTOMER" and all its successors, as well as any Tx Object that uses a Tx Template called "Dimension" (both values are case-sensitive in this example).
{name: 'SRC_CUSTOMER'}+ OR {templateName: 'Dimension'}
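To see how the example statement breaks down into the pieces named in the grammar, the Python sketch below splits it into terms and operators. It is not the Tx parser: the regular expressions and the parse_statement helper are assumptions made for illustration, and the sketch does no error checking.

```python
import re

# One term: optional leading "+", a {...} body, optional trailing "+".
TERM = re.compile(r"(?P<pre>\+)?\{(?P<body>[^{}]*)\}(?P<post>\+)?")
# One attribute-value pair inside a term; the value keeps its quotes.
PAIR = re.compile(r"""(\w+)\s*:\s*('[^']*'|"[^"]*")""")


def parse_statement(statement: str):
    """Split a flow statement into its terms and the operators between them."""
    terms = []
    for match in TERM.finditer(statement):
        terms.append({
            "predecessors": match.group("pre") is not None,
            "pairs": dict(PAIR.findall(match.group("body"))),
            "successors": match.group("post") is not None,
        })
    # Whatever remains once the terms are removed is treated as operators.
    operators = TERM.sub(" ", statement).split()
    return terms, operators


terms, operators = parse_statement(
    "{name: 'SRC_CUSTOMER'}+ OR {templateName: 'Dimension'}"
)
```

Here the first term carries the trailing + (successors), the second term has neither marker, and the only operator is OR.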
Parameters:
Expression: {attribute: value [attribute: value]}
Available Attributes for filtering
name: Selects objects based on their name.
Example: {name: 'DIM_PRODUCT'}
location: Selects objects based on their location.
Example: {location: 'SG_TEST'}
templateName: Selects objects based on the template type.
Example: {templateName: 'STAGE'}
Example: {name: 'STG_CUSTOMER' location: 'STAGING' templateName: 'STAGE'}
Operators: Combine multiple expressions using Boolean logic operators:
AND: Selects objects that match all specified conditions.
Example: {location: 'REP'} AND {name: 'DIM_CUSTOMER'}
OR: Selects objects that match any of the specified conditions.
Example: {name: 'DIM_CUSTOMER'} OR {location: 'REP'}
NOT: Excludes objects that match a specific condition.
Example: NOT {name: 'STG_CA*'} excludes objects whose names start with 'STG_CA'.
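One way to reason about the operators is as set operations over the objects in the project. The Python sketch below uses a made-up inventory (OBJECTS and the select helper are hypothetical) and assumes that NOT means "everything in the project except the matches"; it illustrates the logic, not how Tx evaluates expressions internally.

```python
# Hypothetical project inventory; names and locations are made up.
OBJECTS = [
    {"name": "DIM_CUSTOMER", "location": "REP"},
    {"name": "DIM_PRODUCT", "location": "REP"},
    {"name": "STG_CUSTOMER", "location": "STAGING"},
    {"name": "STG_CARRIER", "location": "STAGING"},
]
ALL_NAMES = {obj["name"] for obj in OBJECTS}


def select(attribute: str, value: str) -> set:
    """Names of the objects whose attribute equals value exactly."""
    return {obj["name"] for obj in OBJECTS if obj[attribute] == value}


# {location: 'REP'} AND {name: 'DIM_CUSTOMER'} -> intersection
and_result = select("location", "REP") & select("name", "DIM_CUSTOMER")
# {name: 'DIM_CUSTOMER'} OR {location: 'REP'} -> union
or_result = select("name", "DIM_CUSTOMER") | select("location", "REP")
# NOT {location: 'STAGING'} -> complement within the project (assumed)
not_result = ALL_NAMES - select("location", "STAGING")
```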
Predecessors and Successors:
Use + before or after an expression to include predecessors or successors in the flow (as defined by ref functions).
Example: +{name: 'STG_SUPPLIER'} selects all dependencies (predecessors) of 'STG_SUPPLIER'.
Example: {name: 'STG_SUPPLIER'}+ selects all objects dependent on 'STG_SUPPLIER' (its successors).
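Predecessor and successor selection amounts to walking the dependency graph in one direction or the other. The Python sketch below uses a hypothetical dependency map (DEPENDS_ON) and assumes that "all" means transitive, so indirect dependencies are included as well.

```python
from collections import deque

# Hypothetical dependency map: each object lists the objects it references.
DEPENDS_ON = {
    "SRC_SUPPLIER": [],
    "STG_SUPPLIER": ["SRC_SUPPLIER"],
    "DIM_SUPPLIER": ["STG_SUPPLIER"],
    "FACT_ORDERS": ["DIM_SUPPLIER"],
    "DIM_CUSTOMER": [],
}


def reachable(start: str, edges: dict) -> set:
    """Every node reachable from start by following edges (start excluded)."""
    seen = set()
    queue = deque(edges[start])
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(edges[node])
    return seen


def invert(edges: dict) -> dict:
    """Flip the edges so each object lists the objects that depend on it."""
    inverted = {node: [] for node in edges}
    for node, deps in edges.items():
        for dep in deps:
            inverted[dep].append(node)
    return inverted


# +{name: 'STG_SUPPLIER'}  -> walk towards what STG_SUPPLIER depends on
predecessors = reachable("STG_SUPPLIER", DEPENDS_ON)
# {name: 'STG_SUPPLIER'}+  -> walk towards what depends on STG_SUPPLIER
successors = reachable("STG_SUPPLIER", invert(DEPENDS_ON))
```

In this made-up project, DIM_CUSTOMER is unrelated to STG_SUPPLIER, so neither selection picks it up.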
Advanced Syntax with Special Characters:
Wildcard Matching:
Use * to match multiple characters.
Example: {name: 'STG_C*'} selects all objects with names starting with 'STG_C'.
Use ? to match exactly one character.
Example: {name: 'STG_C?'} matches names like 'STG_CA' or 'STG_CB'.
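The two wildcards behave like shell-style patterns. As a rough analogy, the Python sketch below uses the standard library's fnmatch.fnmatchcase, which treats * and ? the same way. The object names are made up, and fnmatch also understands [...] character sets, which this document does not describe for Tx.

```python
from fnmatch import fnmatchcase

# Hypothetical object names.
names = ["STG_CA", "STG_CB", "STG_CUSTOMER", "STG_SUPPLIER", "DIM_CUSTOMER"]

# * matches any run of characters, so every name starting with STG_C is selected.
star_matches = [n for n in names if fnmatchcase(n, "STG_C*")]

# ? matches exactly one character, so only the six-character names qualify.
single_matches = [n for n in names if fnmatchcase(n, "STG_C?")]
```

With these names, the * pattern selects STG_CA, STG_CB, and STG_CUSTOMER, while the ? pattern selects only STG_CA and STG_CB.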
Errors
When an expression is entered, the parser checks it for errors in real time. If any part of the expression contains an error, that section is highlighted and a corresponding alert appears below the 'Statement' field. When there are several errors, the first one encountered is the one displayed and highlighted.
If the expression has an error, no result is displayed.
If multiple expressions exist, only expressions with valid key-value pairs will be displayed.
Editing Flows
In the properties panel of the flow, the following properties are available:
Name: Check or rename the flow.
Objects: List of objects that appear in the flow.
Description: Description of the flow.
Managing Flows
Flows can be deleted or duplicated.
Delete: Use the context menu to delete a flow. This action does not remove the Tx objects included in the flow.
Duplicate: Duplicate an existing flow to quickly create a new one based on the original.
Running and Validating Flows
Once a flow is defined, running or validating the flow ensures the Tx objects are correctly processed.
A flow can only be run when the project is in a saved state.
Flows are executed in DAG order, so dependencies are respected. For instance, if an object depends on another, it is executed only after the object it depends on has completed.
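Executing in DAG order means running the objects in a topological order of the dependency graph. The Python sketch below shows the idea with the standard library's graphlib module (Python 3.9 or later). The dependency map is hypothetical, and Tx's own scheduler may order independent objects differently.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each object maps to the objects it depends on.
depends_on = {
    "STG_CUSTOMER": {"SRC_CUSTOMER"},
    "STG_ORDERS": {"SRC_ORDERS"},
    "DIM_CUSTOMER": {"STG_CUSTOMER"},
    "FACT_SALES": {"DIM_CUSTOMER", "STG_ORDERS"},
}

# static_order() yields every object after all of the objects it depends on.
# It raises graphlib.CycleError if the dependencies contain a cycle.
execution_order = list(TopologicalSorter(depends_on).static_order())
```

Several valid orders usually exist. The only guarantee is that an object never appears before something it depends on.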
Execution Modes
Create
• Validate Create All: Validates the creation of Tx objects without actually creating the objects, helping identify any potential errors before execution.
• Create All: Executes the creation flow and performs the creation operations on all included Tx objects.
Run
• Validate Run All: Validates the execution without actually running the objects, helping identify any potential errors before execution.
• Run All: Executes the flow and performs the run operations on all included Tx objects.
Flow Statuses
During and after execution, each Tx object within a flow is assigned a status reflecting its progress:
• Waiting: The object is queued and awaiting execution.
• Running: The object is currently being executed.
• Success: The object has been successfully executed.
• Failed: The object encountered an error during execution.
Status Behavior
If a Tx object fails, dependent objects will display a Waiting status until the failure is resolved.
Independent objects (those with no dependency on the failed object) will continue to run.
Each status is visible next to the Tx object in the flow diagram or the Main flow, so progress can be monitored and failures handled as needed.
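The status behavior can be simulated in a few lines. The Python sketch below is a simplified model, not Tx's execution engine: simulate_flow, the dependency map, and the set of failing objects are all hypothetical, and objects are processed one at a time rather than in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each object maps to the objects it depends on.
depends_on = {
    "STG_CUSTOMER": {"SRC_CUSTOMER"},
    "STG_ORDERS": {"SRC_ORDERS"},
    "DIM_CUSTOMER": {"STG_CUSTOMER"},
    "FACT_SALES": {"DIM_CUSTOMER", "STG_ORDERS"},
}


def simulate_flow(depends_on: dict, failing: set) -> dict:
    """Return the final status of every object after one simulated run."""
    order = list(TopologicalSorter(depends_on).static_order())
    status = {name: "Waiting" for name in order}
    for name in order:
        deps = depends_on.get(name, ())
        # An object runs only when everything it depends on has succeeded;
        # otherwise it keeps its Waiting status.
        if any(status[dep] != "Success" for dep in deps):
            continue
        status[name] = "Failed" if name in failing else "Success"
    return status


statuses = simulate_flow(depends_on, failing={"STG_CUSTOMER"})
```

With STG_CUSTOMER failing, DIM_CUSTOMER and FACT_SALES stay in Waiting, while the independent SRC_ORDERS and STG_ORDERS branch still finishes with Success.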
Viewing Execution Results
After running or validating a flow, results are displayed in the Results tab. This tab provides a summary of the execution, including success and failure statuses for each Tx object. Individual results can be expanded to see more details, such as error messages, detailed SQL statements or execution times.
Best Practices for Defining and Running Flows
• Use Boolean operators (AND, OR, NOT) to create precise conditions for including/excluding Tx objects.
• Regularly validate flows to catch potential errors before running.
• Leverage flow statuses to troubleshoot execution and make adjustments to Tx objects as needed.
See also: