
Blocking factors

Updated over a year ago

Sometimes your experimental design contains blocks of runs or samples which share an experimental feature. For example, perhaps your experimental design is spread across multiple plates. Although every effort may be taken to treat all samples in the same way, regardless of plate, the samples that share a plate are statistically more similar to each other than to those on different plates. It is important, therefore, that these structures in the experimental design are taken into account when analysing data from the experiment. The factors we introduce to account for these design structures are called blocking factors.

It is important to account for blocking factors when performing your analysis as it is possible for design factors to be correlated with blocking factors. In this situation, if you do not take blocking factors into account, it is possible for you to draw inaccurate conclusions from the analysis.

To take an example, suppose we perform a DOE with 5 factors, with runs split across two plates. One of these factors, Factor X, has two levels, A and B. After randomisation, we find that the runs on plate 1 are 80% A and 20% B, and the runs on plate 2 are therefore 20% A and 80% B. Unbeknownst to us, the plates are made from different batches of plastic with subtly different optical properties that affect the absorbance values we measure. If we didn’t account for the different plates with a blocking factor, we might attribute this absorbance difference to Factor X, because of its correlation with the plate. Including the plate-based blocking factor as a main effect in the model, however, gives us a better estimate of the true effect of Factor X on our absorbance readings.
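The plate example can be reproduced numerically. The following is a minimal Python sketch, not how Synthace performs the analysis: the run counts (10 per plate), the absorbance values, and the helper `ols` are all assumptions made for illustration. Factor X is given no true effect, and plate 2 is assumed to read 0.3 higher than plate 1.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical design: plate 1 is 80% A / 20% B, plate 2 is 20% A / 80% B.
runs = [("A", 1)] * 8 + [("B", 1)] * 2 + [("A", 2)] * 2 + [("B", 2)] * 8

# Simulated absorbance: Factor X has NO true effect; plate 2 reads 0.3 higher.
y = [1.0 + (0.3 if plate == 2 else 0.0) for _, plate in runs]

is_b = [1.0 if level == "B" else 0.0 for level, _ in runs]
is_plate2 = [1.0 if plate == 2 else 0.0 for _, plate in runs]

# Model without the blocking factor: intercept + Factor X.
naive = ols([[1.0, b] for b in is_b], y)
# Model with the plate-based blocking factor as a main effect.
blocked = ols([[1.0, b, p] for b, p in zip(is_b, is_plate2)], y)

print(f"Factor X effect, plate ignored:  {naive[1]:.3f}")
print(f"Factor X effect, plate included: {abs(blocked[1]):.3f}")
print(f"Plate effect:                    {blocked[2]:.3f}")
```

With these made-up numbers, the model that ignores the plate assigns Factor X an apparent effect of about 0.18, which is entirely due to the plate. Once the plate is included as a main effect, the Factor X estimate drops to zero and the 0.3 offset is attributed to the plate.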

Within the Response Analysis application, if a blocking factor is present, the factor is automatically selected and added to the default model as a main effect.

Find Effects tab

On the Find Effects tab, nested model comparison is used to find terms which significantly improve the fit of a model when they are included. This involves comparing a restricted model, which does not include the term of interest, and a full model, which does include the term of interest. When blocking factors are selected, they are added to both the restricted and the full model as main effects. This ensures that any effect the blocking factors have on the response does not influence the testing of the term of interest.
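To make the comparison concrete, here is a minimal Python sketch of a nested model comparison for a single term, run once with the blocking factor in both models and once without it. The article does not say which test statistic Synthace uses, so the F statistic here is an assumption, as are the data values and the helpers `ols`, `rss` and `f_statistic`. No p-value is computed.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def rss(X, y):
    """Residual sum of squares of the least-squares fit."""
    b = ols(X, y)
    return sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
               for row, yi in zip(X, y))

def f_statistic(restricted, full, y):
    """F statistic for the extra columns in `full` relative to `restricted`."""
    extra = len(full[0]) - len(restricted[0])
    rss_r, rss_f = rss(restricted, y), rss(full, y)
    return ((rss_r - rss_f) / extra) / (rss_f / (len(y) - len(full[0])))

# Hypothetical data: DNA 1 at levels 5 and 10, three replicates per plate.
dna = [5.0] * 3 + [10.0] * 3 + [5.0] * 3 + [10.0] * 3
plate2 = [0.0] * 6 + [1.0] * 6
noise = [0.1, -0.2, 0.1, -0.1, 0.2, -0.1, 0.2, -0.1, -0.1, 0.1, 0.1, -0.2]
y = [10.0 + 0.4 * d + 3.0 * p + e for d, p, e in zip(dna, plate2, noise)]

# Blocking factor in BOTH models; only the term of interest (DNA 1) differs.
f_blocked = f_statistic([[1.0, p] for p in plate2],
                        [[1.0, p, d] for p, d in zip(plate2, dna)], y)
# Same comparison with the blocking factor left out of both models.
f_unblocked = f_statistic([[1.0] for _ in y], [[1.0, d] for d in dna], y)

print(f"F for DNA 1 with plate in both models: {f_blocked:.1f}")
print(f"F for DNA 1 with plate left out:       {f_unblocked:.1f}")
```

In this made-up data set the plate-to-plate variation ends up in the residual when the blocking factor is left out, which makes the DNA 1 term look far less significant than it does when the plate is accounted for in both models.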

Additionally, the effect of the blocking factors is removed from the plots shown on the Find Effects tab. To illustrate how this works, take this example where the plate-based blocking factor is having a marked effect on the response:

The DNA 1 factor has two levels, 5 and 10. Some jitter has been applied to allow for individual points to be distinguished more easily.

When the plate-based blocking factor is excluded, the data are plotted exactly as shown in the figure above. When the plate-based blocking factor is included, however, the response values are adjusted to remove the effect of the factor, which you can see in the reduced spread of the points at each DNA 1 factor level:

This adjustment is performed by multiplying the fitted coefficient for the blocking factor with the columns of the design matrix corresponding to the factor, and subtracting the result from the response. In the example above, the following model is fitted:

With the Adjusted Response being:
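In words, the adjusted response is the measured response minus the fitted blocking-factor coefficient multiplied by the blocking-factor column of the design matrix. The Python sketch below illustrates that subtraction under several assumptions: a model of the form Response = b0 + b1 × DNA 1 + b2 × Plate, a single 0/1 column coding the two plates, made-up response values, and a hand-written `ols` helper. It is not the Synthace implementation.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical data: DNA 1 at levels 5 and 10, three replicates per plate.
dna = [5.0] * 3 + [10.0] * 3 + [5.0] * 3 + [10.0] * 3
plate2 = [0.0] * 6 + [1.0] * 6  # design-matrix column for the blocking factor
noise = [0.1, -0.2, 0.1, -0.1, 0.2, -0.1, 0.2, -0.1, -0.1, 0.1, 0.1, -0.2]
response = [10.0 + 0.4 * d + 3.0 * p + e
            for d, p, e in zip(dna, plate2, noise)]

# Fit Response ~ intercept + DNA 1 + Plate.
b0, b_dna, b_plate = ols([[1.0, d, p] for d, p in zip(dna, plate2)], response)

# Adjusted Response = Response - (plate coefficient * plate column).
adjusted = [r - b_plate * p for r, p in zip(response, plate2)]

def spread(values, level):
    """Range of the values observed at one DNA 1 level."""
    at_level = [v for v, d in zip(values, dna) if d == level]
    return max(at_level) - min(at_level)

for level in (5.0, 10.0):
    print(f"DNA 1 = {level}: raw spread {spread(response, level):.2f}, "
          f"adjusted spread {spread(adjusted, level):.2f}")
```

The spread at each DNA 1 level shrinks after the adjustment because the plate offset no longer separates the points.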

Create Models tab

On the Create Models tab, the blocking factors that are selected in the factor selection sidebar are included as main effects in the model. You can confirm this by checking that coefficients for the blocking factor appear in the Coefficients table. Selected blocking factors are included in your model as main effects only; interaction terms including blocking factors are not currently supported.

Unlike other factors, the blocking factors do not appear in the term selector in the middle of the page. Although we recommend including blocking factors in your model, there may be situations where you would like to remove one, for example, if it is highly correlated with another effect that you want to model instead. If this is the case, blocking factors may be excluded from the model by deselecting them in the factor selection sidebar.

The two automatic term selection methods available in Synthace, Stepwise and LASSO, both treat blocking factors differently to other factors, ensuring that they are always included in the models and cannot be removed as part of the term selection process.

Stepwise term selection

For Stepwise term selection, blocking factor main effects are included in the base model. This can be seen in the Forward table above, where the base model just consists of the intercept and plate-based blocking factor terms. Stepwise term selection is implemented in Synthace using the stepAIC function from the MASS R package (MASS Documentation). If blocking factors are present, they are included in the lower model, which is passed to the scope argument of stepAIC.
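The idea of a protected base model can be sketched in a few lines of Python. This is a simplified stand-in for forward selection by AIC, not a port of stepAIC: the data, the factor name `salt`, and the helpers `ols`, `rss` and `aic` are hypothetical. The AIC is computed as n·log(RSS/n) + 2k, where k is the number of model terms. The blocking factor starts in the base model and is never offered as a candidate, so the search cannot remove it.

```python
import math

def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def rss(columns, y):
    """Residual sum of squares for a model given as a list of columns."""
    X = [list(row) for row in zip(*columns)]
    b = ols(X, y)
    return sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
               for row, yi in zip(X, y))

def aic(columns, y):
    n = len(y)
    return n * math.log(rss(columns, y) / n) + 2 * len(columns)

# Hypothetical 8-run design: two coded factors plus a plate blocking factor.
terms = {
    "intercept": [1.0] * 8,
    "plate": [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0],
    "dna": [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0],
    "salt": [-1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, 1.0],
}
noise = [0.1, -0.1, -0.2, 0.2, 0.1, -0.2, 0.1, 0.0]
# Only "dna" and "plate" truly affect the simulated response.
y = [10.0 + 2.0 * d + 3.0 * p + e
     for d, p, e in zip(terms["dna"], terms["plate"], noise)]

selected = ["intercept", "plate"]  # base (lower) model: always kept
candidates = ["dna", "salt"]       # the blocking factor is never a candidate
current = aic([terms[t] for t in selected], y)

while candidates:
    scores = {c: aic([terms[t] for t in selected + [c]], y)
              for c in candidates}
    best = min(scores, key=scores.get)
    if scores[best] >= current:  # no candidate lowers the AIC: stop
        break
    selected.append(best)
    candidates.remove(best)
    current = scores[best]

print("Selected terms:", selected)
```

Because the search only ever adds terms on top of the base model, the plate term remains in whichever model is finally selected.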

LASSO term selection

For LASSO term selection, the coefficients for blocking factors are not subject to shrinkage. As shown in the figure, this means that even with very large λ values, when the coefficients for all other terms have been shrunk to zero, we still have blocking factor terms included in the model. LASSO term selection is implemented in Synthace using the cv.glmnet function from the glmnet R package (glmnet Documentation). If blocking factors are present, the penalty.factor for the blocking factor terms is set to zero, with the penalty.factor for other terms set to one. As documented, this means that shrinkage is not applied to the coefficients for the blocking factor terms.
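The effect of a zero penalty factor can be illustrated with a small hand-written LASSO solver in Python. This is a sketch using plain coordinate descent, not glmnet: it does not standardise the columns or cross-validate, and the data, the factor name `salt`, and the function names are assumptions. Each term's penalty is λ multiplied by its penalty factor, so a factor of zero means the coefficient is never shrunk.

```python
def soft_threshold(value, threshold):
    """Shrink `value` towards zero by `threshold`, stopping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso(columns, y, lam, penalty_factor, sweeps=200):
    """Coordinate descent for (1/2n)*RSS + lam * sum(pf_j * |b_j|)."""
    names = list(columns)
    n = len(y)
    coef = {name: 0.0 for name in names}
    for _ in range(sweeps):
        for name in names:
            x = columns[name]
            # Partial residual: leave this term's own contribution out.
            partial = [
                yi - sum(coef[k] * columns[k][i] for k in names if k != name)
                for i, yi in enumerate(y)
            ]
            rho = sum(xi * ri for xi, ri in zip(x, partial)) / n
            z = sum(xi * xi for xi in x) / n
            coef[name] = soft_threshold(rho, lam * penalty_factor[name]) / z
    return coef

# Hypothetical 8-run design: two coded factors plus a plate blocking factor.
columns = {
    "intercept": [1.0] * 8,
    "plate": [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0],
    "dna": [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0],
    "salt": [-1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, 1.0],
}
noise = [0.1, -0.1, -0.2, 0.2, 0.1, -0.2, 0.1, 0.0]
y = [10.0 + 2.0 * d + 3.0 * p + e
     for d, p, e in zip(columns["dna"], columns["plate"], noise)]

# Penalty factor 0 for the intercept and blocking factor, 1 for other terms.
penalty_factor = {"intercept": 0.0, "plate": 0.0, "dna": 1.0, "salt": 1.0}

small = lasso(columns, y, lam=0.05, penalty_factor=penalty_factor)
large = lasso(columns, y, lam=100.0, penalty_factor=penalty_factor)

for label, fit in (("lambda = 0.05", small), ("lambda = 100", large)):
    print(label, {name: round(value, 3) for name, value in fit.items()})
```

At the large λ value the `dna` and `salt` coefficients are exactly zero while the `plate` coefficient stays at its unpenalised estimate, which mirrors the behaviour described above.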

Explore Model and Explore Multiple Models tabs

On the Explore Model and Explore Multiple Models tabs we do not show sliders to alter the value of blocking factors, and the blocking factors do not influence model optimisation. Instead, the models are fitted with the blocking factors, then the blocking factor terms are removed and the average effects of the blocking factors are used as constants in the model instead. By averaging out the effects of the blocking factors, we ensure that the optimisation will not be biased and the response values we show reflect the average across blocking factor levels.
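One way to picture the averaging step is to fold the mean blocking-factor effect into the intercept. The Python sketch below does this for made-up coefficients. The article does not state how the average is weighted, so equal weighting across levels is an assumption, as are the coefficient values and function names.

```python
# Hypothetical fitted coefficients for Response ~ intercept + DNA 1 + Plate,
# with the plate effect expressed per level (plate 1 is the reference level).
intercept = 10.0
dna_coefficient = 0.4
plate_effects = {"plate 1": 0.0, "plate 2": 3.0}

# Average the blocking-factor effect across its levels (equal weights assumed).
average_plate_effect = sum(plate_effects.values()) / len(plate_effects)

def predict_with_plate(dna, plate):
    """Prediction from the fitted model for one specific plate."""
    return intercept + dna_coefficient * dna + plate_effects[plate]

def predict_averaged(dna):
    """Prediction with the plate term replaced by its average effect."""
    return (intercept + average_plate_effect) + dna_coefficient * dna

for dna in (5.0, 10.0):
    per_plate = [predict_with_plate(dna, plate) for plate in plate_effects]
    print(f"DNA 1 = {dna}: per-plate predictions {per_plate}, "
          f"averaged prediction {predict_averaged(dna)}")
```

The averaged prediction sits midway between the per-plate predictions, and it depends only on the non-blocking factors, which is why no slider is needed for the plate.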

Further reading

For more information about nested model comparison for finding effects click here.

For more information about Stepwise term selection click here.

For more information about LASSO term selection click here.
