LASSO regression selects significant terms by optimizing the model as a whole. It uses a technique called shrinkage, in which a penalty term (λ) pulls all coefficient estimates towards zero, driving the coefficients of the least important terms to exactly zero.
By starting from a model containing all possible terms and applying shrinkage at a particular strength, a selection of model terms can be defined: those whose coefficients were not shrunk to zero.
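To make shrinkage concrete, here is a minimal sketch (not Synthace's implementation) using coordinate descent with soft-thresholding, a standard way of fitting LASSO, on synthetic data where only the first two of five candidate terms truly matter; the data, the λ value of 0.5, and all function names are illustrative:

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within lam of zero become exactly zero."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def fit_lasso(X, y, lam, n_sweeps=200):
    """Coordinate-descent LASSO: minimize (1/2n)||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            # Residual with feature j's current contribution added back in.
            r = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r / n, lam) / (X[:, j] @ X[:, j] / n)
    return beta

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                                     # five candidate terms
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)   # only two matter

beta = fit_lasso(X, y, lam=0.5)
print(beta)  # the three irrelevant terms get coefficients of exactly zero
```

Note that the surviving coefficients are also shrunk somewhat below their true values (3 and -2): the penalty biases all estimates towards zero, not just the small ones.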
The question is then how to choose the shrinkage parameter: lower values lead to larger models, higher values to smaller ones. How do we make a principled choice?
The procedure implemented by Synthace, which is typical, searches over many values of λ and assesses each by cross-validation. In cross-validation, the data are split into multiple partitions; each partition in turn is used to test a model fitted on the remaining data, and performance is reported as the average over all partitions.
Running this cross-validation procedure for each of a series of λ values allows the λ with the best cross-validated performance to be chosen; the terms in the resulting model are then reported.
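The search over λ can be sketched as follows, again as an illustration rather than Synthace's implementation. It reuses the same coordinate-descent fit as above; the candidate λ grid, the choice of 5 partitions, and the synthetic data are all assumptions made for the example:

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * max(abs(z) - lam, 0.0)

def fit_lasso(X, y, lam, n_sweeps=200):
    """Coordinate-descent LASSO fit (illustrative, as in the earlier sketch)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r / n, lam) / (X[:, j] @ X[:, j] / n)
    return beta

def cv_error(X, y, lam, k=5):
    """Mean squared test error averaged over k partitions of the data."""
    folds = np.array_split(np.arange(X.shape[0]), k)
    errors = []
    for fold in folds:
        train = np.ones(X.shape[0], dtype=bool)
        train[fold] = False
        beta = fit_lasso(X[train], y[train], lam)                    # fit on remaining data
        errors.append(np.mean((y[fold] - X[fold] @ beta) ** 2))      # test on held-out partition
    return np.mean(errors)

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 6))
y = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.3, size=120)

grid = [0.01, 0.05, 0.1, 0.5, 1.0]                     # candidate lambda values
best = min(grid, key=lambda lam: cv_error(X, y, lam))  # lambda with lowest CV error
selected = [j for j, b in enumerate(fit_lasso(X, y, best)) if b != 0]
print(best, selected)
```

The model is then refitted on all the data at the chosen λ, and the terms with nonzero coefficients are reported as the selection.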
To learn more about the statistics behind fitting a LASSO regression model, click here.
To learn how to apply LASSO regression to your data in Synthace, click here (Coming Soon!)