What is a Tx Object?
Tx Objects form the foundation for the transformational pipeline. Represents a discrete step in the data processing pipeline, capturing business logic, structure, and source references.
A Tx Object combines business logic and a customizable boilerplate template code to create the transformation from a Tx Template.
The business logic within a Tx Object includes the precise instructions and conditions tailored to the organization's needs, ensuring that data transformations align with business goals and compliance requirements. Meanwhile, the boilerplate template code provides a standardized foundation, streamlining the creation and maintenance of these objects by offering reusable, pre-defined and customizable structures and patterns.
This combination not only enhances consistency and efficiency but also reduces the likelihood of errors, as the underlying template ensures that essential elements and best practices are consistently applied across different transformations.
Tx Objects require a source node (Tx Source or Tx Object) and a Tx template, which together define the capabilities of the Tx Object. A Tx Object can have multiple source nodes.
Working with Tx Objects
How to create a Tx Object?
The Tx Object requires at least one previous node created in the project Directed Acyclic Grap (DAG).
The next video showcases how to create a Tx Object.
Once a Tx Object has been created, it will appear in the left-menu under the same Location of the node source.
It will also appear in the DAG as a direct arc with the previous source node/s.
Editing a Tx Object
There are 3 ways to access the edit screen of a Tx Object.
From the left-side menu:
Select the contextual menu from the Tx Object on the left-side pane and select open.
From the DAG select the contextual menu from the Tx Object and select edit.
Double click on the Tx Object.
Once the Tx Object is in edit mode the object will be represented in 2 sections:
Column grid.
Logic.
The Column Grid Section
The column grid showcases the data structure (column names, keys, constraints, descriptions) of the Tx Object.
The column grid will be automatically generated from the source node/s and considering the options from Tx Template selected to create the Tx Object. All the proposed columns can be modified.
All columns will be inherited from the source, plus all the systems columns generated in the template.
In the example below, a new column called __load_date has been generated.
Editing the columns in the Column grid
Column name, Data type, constraints, default values, descriptions and position can be edited directly in the Column grid.
Keys can be only editable through the configuration of the properties of the Tx Object.
Some of those options are showcased in the next video.
Check the related Tx Template article in the See also section to understand how to create a new template or change the default one.
The Logic section
In this section, a custom SQL can be created for any desired transformation purpose as long as the code is compatible with the destination database SQL requirements.
You can refer to any previous node using the ref option.
Using refs
The refs have 2 fundamental functions:
They create the dependencies between objects in the DAG.
They use a Jinja function to refer to a fully qualified name.
This is a key feature in a Tx project that allows the building of a dependency graph of the transformations, ensuring that Tx Objects are built and executed in the correct order and allowing code reuse.
The way to reference any previous Tx Object is by providing the Location ID and the Tx Object name.
Refs are available in three flavors that include all or part of the linking and naming behavior:
ref - as described above, a standard ref serves two functions: to create a dependency between referenced and referencing object, and compile to the fully qualified name of the former.
ref_text - does not create a dependency on the DAG and only prints the fully qualified name of the referenced object.
ref_link: only creates a dependency but does not print the fully qualified name of the referenced object.
Validating SQL Logic
There is the possibility to validate the SQL statement created in the logic part:
Press the arrow in the select data button.
Select validate select.
A validation plan will appear showcasing if the SQL Logic is compatible with the destination database.
Refresh columns
The Column Grid and the Logic sections should be aligned as they are the foundation for the transformation logic for the Create and Run commands generated using Jinja in the corresponding Tx Template.
To check that both the Logic and the Column grid are aligned, use the option Refresh columns (1).
The system will detect if there is any difference between the Column Grid and the Logic sections.
In the example below, a popup will appear to confirm the alignment of both sections:
The column __load_date is part of the Column grid.
The column __load_date is not part of the Logic.
A popup will appear to confirm if it's Ok to align both sections. The system will delete the columns from the Column Grid that don't exist in the Logic section.
In the future we will allow to setup a model contract to avoid changes in the Column grid.
The Create command
The Create command will execute the code based on the Jinja code of the Tx Template on the destination database, the code does not run locally on Tx.
Usually, this command will compile a create statement of the Tx Object executed against the database. This is executed at deploy time, setting up the data structure without loading data.
Validating and executing the Create script
In every Tx Object, the create command can be validated previously to create the object in the destination database.
To run the Create command select the Create option.
To validate the Create command, expand the create button and
Select the validate create option.
To check the outcomes of the validation or creation process, refer to the Results screen as demonstrated in the following video.
On the Results screen a compiled SQL generated from the Tx Template of the Tx Object will be found.
The Run command
The Run command will execute the code based on the Jinja code of the Tx Template on the destination database, the code does not run locally on Tx.
Usually, the run command is a Jinja template that defines how data is loaded into the object. The compiled run template results in the execution of SQL statements, such as INSERT INTO
, MERGE INTO
, or other SQL commands, to perform the necessary data operations.
This is called on demand through a scheduler, allowing for flexible data loading based on defined schedules.
Validating and executing the Run script
In every Tx Object, the Run command can be validated previously to Run the object in the destination database.
To execute the Run command, select the Run option.
To validate the Run command, expand the Run button and
Select the validate Run option.
To check the outcomes of the validation or creation process, refer to the Results screen as demonstrated in the following video.
On the Results screen a compiled SQL generated from the Tx Template of the Tx Object will be found.
Duplicating Tx Objects
There is the possibility to duplicate a Tx Object.
All the properties and references to other Tx Objects will be copied to the new object.
Select the contextual menu from the desired Tx Object to copy from the DAG.
Select the duplicate option.
The new Tx Object will appear in the DAG.
Deleting Tx Objects
Tx Objects can be completely removed from the Tx project. The references with other Tx Objects will disappear, but the referenced Tx Object will remain in the DAG and in the project.
To delete the Tx Object, follow the next steps:
Select the contextual menu from the left-side pane or
Select the contextual menu from the DAG.
Select the Delete option.
A confirmation popup will appear to confirm the deletion.