The statistics behind LASSO regression models

LASSO (Least Absolute Shrinkage and Selection Operator) regression is an automated way of selecting predictive variables. It works by penalising the sum of the absolute values of the regression coefficients: constraining this L1 norm shrinks the coefficients, and because some are forced exactly to zero, the corresponding variables are removed from the model.
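For a Gaussian response, the penalised form of this problem (with the 1/(2n) scaling used by glmnet) can be written as:

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta} \; \frac{1}{2n}\sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j} x_{ij}\beta_j\Bigr)^2 + \lambda \sum_{j}\lvert \beta_j \rvert
```

The second term is the L1 penalty: as λ grows, it dominates the fit and pushes individual β_j to exactly zero.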

Statistics

The LASSO regression is implemented using the cv.glmnet function from the glmnet R package (glmnet Documentation). A value of 1 is passed as the alpha argument so that the algorithm uses the LASSO penalty exclusively, rather than other shrinkage approaches such as ridge regression. Values of the independent variables are standardised to have a mean of 0 and a standard deviation of 1 via the standardize argument. A vector of λ values is passed as the lambda argument. The λ value is a tuning parameter that controls how strongly the absolute values of the coefficients are shrunk. Large values shrink the coefficients more, which leads to terms being excluded from the model as their coefficients are driven to zero. Conversely, small λ values result in minimal shrinkage, so more terms are included in the model. There are 51 λ values passed to cv.glmnet, given by λ = 10^x where x runs from 2.0 to -3.0 inclusive, in decrements of 0.1.
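The λ grid described above can be reconstructed in a few lines of base R (a minimal sketch; the variable names are illustrative):

```r
# Reconstruct the lambda grid: lambda = 10^x, with x running
# from 2.0 down to -3.0 in steps of 0.1 (51 values in total).
x <- seq(2.0, -3.0, by = -0.1)
lambdas <- 10^x

length(lambdas)  # 51
range(lambdas)   # from 10^-3 = 0.001 up to 10^2 = 100
```

Note that the grid is generated in decreasing order, which matches glmnet's convention of fitting the regularisation path from the largest λ downwards.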

To determine the λ value that leads to the optimal choice of model terms, cross-validation is used: either 5-, 10-, or 20-fold, depending on the selection the user makes. The output of the cross-validation process is shown in the table and the graph in the Term Selection section of the Create Models tab.
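Putting the pieces together, the settings described above map onto a cv.glmnet call along these lines (a hedged sketch, assuming the glmnet package is installed; the simulated data and the choice of nfolds = 10 are illustrative, not taken from the article):

```r
library(glmnet)

# Illustrative data: 100 observations, 4 predictors, 2 of them informative.
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 4), n, 4)
y <- 2 * X[, 1] - X[, 2] + rnorm(n)

lambdas <- 10^seq(2.0, -3.0, by = -0.1)  # the 51-value grid described above

cvfit <- cv.glmnet(
  X, y,
  alpha       = 1,        # 1 = pure LASSO (0 would be ridge)
  standardize = TRUE,     # scale predictors to mean 0, sd 1
  lambda      = lambdas,  # the vector of tuning values
  nfolds      = 10        # 5, 10 or 20, per the user's selection
)

cvfit$lambda.1se  # largest lambda within one SE of the minimum CV error
```

cvfit$cvm and cvfit$cvsd hold the mean cross-validated error and its standard error for each λ, which is exactly what the table and graph described below display.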

The table has four columns:

  • Lambda - the value of λ used

  • Mean CV Error - the mean cross-validated error for that value of λ

  • SD CV Error - the estimate of standard error for the Mean CV Error

  • No. of Terms (exc. Intercept) - the number of model terms selected by the LASSO approach with that value of λ, not including the intercept

Note: The maximum number of terms in this table will be greater than the number of terms in your model if you have categorical factors. If a categorical factor has n levels, then n-1 coefficients are required to model it, for every term in which it appears. In the example above, there is a categorical factor with 3 levels and a continuous factor. 1 coefficient is required for the continuous factor, 1 coefficient is required for the quadratic of the continuous factor, 2 coefficients are required for the categorical factor, and 2 coefficients are required for the interaction term between the categorical and continuous factors. Adding these up gives 6, which is the maximum number of terms considered during the LASSO process, as shown in the graph.
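This coefficient count can be verified directly with model.matrix, which performs the same factor-to-dummy-column expansion that glmnet relies on (a small sketch with made-up data matching the example: a 3-level categorical factor and a continuous factor):

```r
# Illustrative data: a 3-level categorical factor and a continuous factor.
cat_f <- factor(c("A", "B", "C", "A", "B", "C"))
cont  <- c(1.2, 0.5, -0.3, 2.1, 0.0, 1.7)

# model.matrix expands the 3-level factor into n - 1 = 2 dummy columns
# in every term where it appears.
mm <- model.matrix(~ cont + I(cont^2) + cat_f + cat_f:cont)

colnames(mm)     # intercept, cont, cont^2, 2 factor dummies, 2 interactions
ncol(mm) - 1     # 6 coefficients excluding the intercept: 1 + 1 + 2 + 2
```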

The graph shows how the mean cross-validated error and the number of terms selected varies with respect to the λ value used. The bounds on the line of mean cross-validated error correspond to the estimate of standard error.

The model chosen by the algorithm is highlighted in the table. It is chosen by taking the minimum value of the mean cross-validated error obtained across all λ values and finding the largest λ value that gives a mean cross-validated error within one standard error of this minimum. This is an approach suggested by the authors of the glmnet R package (Tay et al. 2023). For example, consider this table:

The minimum value of the mean cross-validated error is 968.6758 when λ = 0.3162. The estimate of the standard error is 179.9526. The model chosen by the approach is therefore the model given by the largest λ value where the mean cross-validated error is smaller than 968.6758 + 179.9526 = 1148.6284. This corresponds to the highlighted model in the table where λ = 5.0119.
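The one-standard-error rule can be written as a short standalone function. In this sketch, the minimum row (968.6758, SE 179.9526) and the selected λ = 5.0119 are quoted from the worked example above, while the remaining rows are made up purely to complete the illustration:

```r
# One-standard-error rule: find the minimum mean CV error, then pick the
# largest lambda whose mean CV error is within one SE of that minimum.
one_se_lambda <- function(lambda, cvm, cvsd) {
  i_min  <- which.min(cvm)
  cutoff <- cvm[i_min] + cvsd[i_min]
  max(lambda[cvm <= cutoff])
}

# Two rows (lambda = 0.3162 minimum, lambda = 5.0119 chosen) are from the
# worked example; the other rows are illustrative.
lambda <- c(10.0000,  5.0119,  1.0000,   0.3162,  0.1000)
cvm    <- c(1500.00, 1100.00, 1000.00, 968.6758, 990.000)
cvsd   <- c( 200.00,  180.00,  175.00, 179.9526, 185.000)

one_se_lambda(lambda, cvm, cvsd)  # 5.0119
```

Here the cutoff is 968.6758 + 179.9526 = 1148.6284, and λ = 5.0119 is the largest value whose mean CV error (1100.00 in this illustration) falls below it.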

Citations

  • R version: 4.0.5

  • glmnet version: 4.1-2

  • Tay, J. K., Narasimhan, B., & Hastie, T. (2023). Elastic Net Regularization Paths for All Generalized Linear Models. Journal of Statistical Software, 106(1), 1–31. https://doi.org/10.18637/jss.v106.i01

To learn more about LASSO regression models, click here.

To learn how to apply LASSO regression to your data in Synthace, click here (Coming Soon!)

To learn about other modelling techniques, click here.

To learn how to assess the quality of your fitted model, click here.