Synthace provides the tools for performing both standard and user-defined data transformations. You can read more about the predefined common transformations, such as Log, Box-Cox and Yeo-Johnson, here. In this article, the focus is on defining a custom transformation, which may be necessary for data where the predefined transformations are inadequate or unsuitable.
How to use the custom data transformation
To transform the data using a custom expression, navigate to the Select & Transform tab.
Performing a custom transformation:
On the left-hand side of the page, from the Response or Transformation dropdown menu, choose the response which is to be transformed (e.g. ‘R1’).
Underneath the menu, click Copy.
This will trigger a prompt to provide a name for the copied response (e.g. ‘R1 transformed’). Working on a copy is advisable, because it leaves the original response (‘R1’) unaltered when proceeding with the transformation.
From the Apply new transform dropdown menu, choose Python Expression.
A custom Python expression can be typed in the box underneath the Apply new transform dropdown menu. The expression needs to be compatible with pandas Series or DataFrames, and is evaluated using pandas.eval. The output of this transformation must be the same size as the input response, or a scalar, which will be propagated to the entire transformation.
E.g. to double the value of response ‘R1’, type in:
df["R1"] * 2.0
To find the average of response ‘R1’, type in:
df["R1"].mean()
This will propagate the average across the entire transform.
When satisfied with the transformation, click Save under the Response or Transformation dropdown menu.
Alternatively, clicking Save As will provide the opportunity to rename the transformation.
If unsatisfied with the transformation, clicking Cancel will cancel all unsaved edits.
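To preview what an expression will produce before typing it into Synthace, the two example expressions can be tried locally in plain pandas. This is only a minimal sketch, not the way Synthace evaluates expressions internally: the sample values are made up, and the helper as_transformed_column is a hypothetical stand-in for the rule that the output must be the same size as the input response, or a scalar that is propagated to every row.

```python
import pandas as pd

# Hypothetical data standing in for a dataset with a response named "R1".
df = pd.DataFrame({"R1": [1.0, 2.0, 3.0, 6.0]})


def as_transformed_column(result, index):
    """Mimic the size rule (assumed helper, not a Synthace function).

    A Series must have one value per input row; a scalar is repeated
    for every row of the input.
    """
    if isinstance(result, pd.Series):
        if len(result) != len(index):
            raise ValueError(
                "expression output must match the input length or be a scalar"
            )
        return result
    # pandas broadcasts a scalar across the supplied index.
    return pd.Series(result, index=index)


# Same expressions as in the article, evaluated directly with pandas.
doubled = as_transformed_column(df["R1"] * 2.0, df.index)
mean_everywhere = as_transformed_column(df["R1"].mean(), df.index)

print(doubled.tolist())
print(mean_everywhere.tolist())
```

The first expression returns one value per row, while the second returns a single number that is repeated for every row, which matches the behaviour described for scalar results above.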
Resources
To learn about the statistics behind transformations, click here.
To learn to select and save subsets of your data, click here.
To learn how to apply a predefined transformation to your data, click here.
To learn how to apply predefined column based calculations to your data, click here.
To learn how to explore your models and make predictions, click here.