A key assumption of DOE is that every run is performed independently of the rest. This means that we can treat all the errors arising from the experimental process as independent and uncorrelated, a requirement for significance tests and linear model building.
However, real experiments often include situations that require you to treat a set of runs as a group, rather than individually. Typically this is because it is costly or inefficient to apply a given treatment to each run individually. A common example in biology is incubation: although it is possible to assemble the mixtures individually in the wells of a plate, most incubators heat the whole plate at once, so incubation is applied to all the wells in the plate as a group.
Grouping runs in this way restricts randomization: the grouped treatment can no longer be assigned to each run independently. This has an impact on the analysis of the data, and failing to account for it can lead to incorrect conclusions. To mitigate this, it’s best to identify the situation upfront and build it into the design. This amounts to designating the particular factor as hard-to-change. Building an optimal design containing a hard-to-change factor leads to a type of experiment known as a split plot.
Statistics
Split-plot designs are defined as having factor treatments applied at two levels: whole plots and subplots. The subplots are grouped by the whole plot they are contained in. Like the subplot level factors, the whole plot factors are part of the model to be fit using the data resulting from running the design.
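To make the two levels concrete, here is a minimal sketch of how a run order for a split plot might be laid out. It is an illustration only, not how Synthace builds designs: the function name, the factor names (temperature, enzyme, substrate) and the choice of a full factorial inside each whole plot are all assumptions made for the example. The key point is that randomization happens twice: once over the order of the whole plots, and once over the subplot runs inside each whole plot.

```python
import random
from itertools import product


def split_plot_run_order(whole_plot_levels, subplot_levels, seed=None):
    """Return runs in execution order for a simple split-plot layout.

    whole_plot_levels: one hard-to-change setting per whole plot
                       (hypothetically, an incubation temperature).
    subplot_levels:    dict mapping each easy-to-change factor name
                       to its list of levels.
    """
    rng = random.Random(seed)

    # First randomization: the order in which whole plots are executed.
    plots = list(enumerate(whole_plot_levels))
    rng.shuffle(plots)

    names = list(subplot_levels)
    runs = []
    for plot_id, setting in plots:
        # Second randomization: the order of subplot runs in this whole plot.
        # Assumption: every whole plot holds a full factorial of subplot factors.
        combos = list(product(*(subplot_levels[name] for name in names)))
        rng.shuffle(combos)
        for combo in combos:
            run = {"whole_plot": plot_id, "temperature": setting}
            run.update(zip(names, combo))
            runs.append(run)
    return runs


# Hypothetical example: 4 plates (whole plots) at 2 temperatures,
# each plate holding all 4 combinations of two subplot factors.
runs = split_plot_run_order(
    whole_plot_levels=[30, 37, 30, 37],
    subplot_levels={"enzyme": [0.1, 1.0], "substrate": [5, 50]},
    seed=1,
)
for run in runs:
    print(run)
```

Every run in a whole plot shares the same temperature, while the subplot factors vary freely from run to run within it.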
Split Plots vs Blocking
There are two key features that a split plot design needs to ensure:
1. There is no confounding between whole-plot-level factors and subplot-level factors.
2. Model terms including the whole-plot-level factors are efficiently estimable.
In reality, criterion 2 subsumes criterion 1: how well a model term can be estimated depends on how confounded it is with other model terms. However, we have presented them separately for a reason: criterion 1 defines how blocking is handled, whereas criterion 2 is what split-plot experiments are for. It is common for people to conflate one with the other, so we hope this helps clarify the difference. In essence, a blocked experiment is one in which runs are grouped but the groups have no other significance: we just happen to need to execute runs in groups. For split plots, we have the additional feature that the groups actually experience different experimental treatments.
From a design perspective, this means that since blocks do not differ in the experimental treatments, it is sufficient to ensure that the runs are evenly distributed with respect to all the effects of interest - thus ensuring that no block tells you more about some effect than the others do.
Split plots, by contrast, assign different experimental treatments to the whole plots, which play the role that blocks play in a blocked design. So now we not only need to make sure the runs are evenly distributed between the plots, we also need to account for how the levels of the whole-plot factors are distributed between the different whole plots and, most likely, consider the estimability of effects that cross the two levels: interactions between whole-plot factors and subplot factors.
Assessing Split Plot Designs
Determining how well split plot designs will perform in a particular experiment follows the same pattern as other optimal designs (see here for a full description). However, split plots have one difference from the general case, which is important to understand.
Since whole-plot-level factors are applied to groups of runs, rather than to each run individually, the effective sample size for any comparison of whole-plot effects is defined by the number of whole plots, rather than the number of runs in the design.
The upshot is that, in most cases, the power to detect whole-plot effects is much lower than the power to detect subplot effects, since the number of whole plots is small compared to the number of runs in the whole design. This can look like a problem when assessing a split-plot design, but it is not necessarily a big one. One reason is that many real whole-plot effects (e.g. temperature, shaking speed etc.) have quite a large effect size and are therefore reasonably easy to detect.
A more technical reason is that, while the power to detect a whole-plot effect as a main effect is indeed low in most cases, the power to detect interactions between whole-plot effects and subplot effects is similar to the power you have to detect other interactions. In this case the effective sample size is governed by the number of subplot runs rather than the number of whole plots.
So, while the whole-plot effects might not always appear significant, if they interact with any of the other factors in the design (again quite likely for the kind of factors we usually assign to whole plots) these interactions will be much easier to detect. They will then pull the whole-plot effects into the model as a consequence of effect hierarchy: a model that includes an interaction should also include the main effects involved in it.
Note that, unfortunately, the lower power in detecting whole-plot level effects also means that estimates of the coefficients of those effects are less precise. The trade-off is that estimates of the coefficients of the sub-plot level effects gain in precision.
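The precision trade-off described above can be checked with a small calculation. The sketch below is an illustration under stated assumptions rather than a general tool: it uses a hypothetical balanced design (4 whole plots, each holding a full 2x2 factorial in subplot factors A and B, with factors coded -1/+1), assumes one shared random shift per whole plot plus independent run-level noise, and arbitrarily sets both variances to 1. For each model term it computes the exact variance of the simple estimate sum(x * y) / N.

```python
from itertools import product

# Hypothetical balanced split plot: the whole-plot factor "temp" is fixed
# within each whole plot, and each plot holds a 2x2 factorial in A and B.
whole_plot_temps = [-1, +1, -1, +1]
runs = [
    {"plot": plot, "temp": temp, "A": a, "B": b}
    for plot, temp in enumerate(whole_plot_temps)
    for a, b in product([-1, +1], repeat=2)
]


def estimator_variance(column, runs, sigma2_sub, sigma2_whole):
    """Variance of sum(x_i * y_i) / N for one -1/+1 coded model column.

    Assumes y_i = mean_i + (shift shared by the whole plot) + (run noise),
    with variances sigma2_whole and sigma2_sub respectively.
    """
    n = len(runs)
    weights = [column(run) / n for run in runs]

    # Run-level noise is independent, so squared weights simply add up.
    sub_part = sigma2_sub * sum(w * w for w in weights)

    # The whole-plot shift is shared, so weights are summed per plot first.
    plot_sums = {}
    for run, w in zip(runs, weights):
        plot_sums[run["plot"]] = plot_sums.get(run["plot"], 0.0) + w
    whole_part = sigma2_whole * sum(s * s for s in plot_sums.values())

    return sub_part + whole_part


columns = {
    "temp": lambda r: r["temp"],              # whole-plot main effect
    "A": lambda r: r["A"],                    # subplot main effect
    "temp:A": lambda r: r["temp"] * r["A"],   # whole-plot x subplot interaction
}
variances = {
    name: estimator_variance(col, runs, sigma2_sub=1.0, sigma2_whole=1.0)
    for name, col in columns.items()
}
for name, var in variances.items():
    print(f"{name:7s} variance = {var:.4f}")
```

With these assumed settings, the whole-plot main effect has a variance of 0.3125, while the subplot main effect and the whole-plot by subplot interaction both have a variance of 0.0625. The whole-plot shift cancels out of any term that is balanced within each whole plot, which is why the interaction is estimated as precisely as a subplot effect.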
Analysis of Split Plot Designs
The typical analysis of split plots requires what statisticians call mixed effect models. The reason for this is that, as we discussed above, the model for a split plot experiment contains two levels, with the subplots nested within the whole plots.
This means there is more than one source of random variation to account for: the run-to-run variation within a whole plot, plus a random shift shared by all the runs in the same whole plot. In addition, we still have the effects of the different factors to model.
You may recall that in standard linear models there is an assumption that all of the noise in the system can be modelled with a single parameter: the standard deviation of a normal distribution.
Mixed effect models allow you to relax this assumption by having two kinds of model terms: fixed effects and random effects. Fixed effects are used to model the change in the mean of a response in relation to a change in one or more factors. Random effects are used to model how a grouping variable contributes to the random variation in the response.
Split plot analyses use random effects to allow each whole plot to have its own error term associated with it. This term does not depend on any of the factors, but acknowledges that grouping runs into whole plots is likely to make runs within the same whole plot more similar to each other than to their counterparts in other whole plots.
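As an illustration, a split plot model with one whole-plot factor and one subplot factor could be written as follows. The notation is our own and is introduced only for this sketch: i indexes whole plots, j indexes runs within a whole plot, w is the whole-plot factor setting and s is the subplot factor setting.

```latex
y_{ij} = \beta_0 + \beta_W w_i + \beta_S s_{ij} + \beta_{WS}\, w_i s_{ij} + \delta_i + \varepsilon_{ij},
\qquad \delta_i \sim \mathcal{N}(0, \sigma_W^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

The \beta terms are the fixed effects. The whole-plot term \delta_i is the random effect shared by every run in whole plot i, and \varepsilon_{ij} is the usual run-level noise. Under this model, two runs in the same whole plot have correlation \sigma_W^2 / (\sigma_W^2 + \sigma^2), while runs in different whole plots are uncorrelated.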
Synthace does not presently allow users to run mixed-effects models and so, to do the correct analysis of this experiment type, it is necessary to export the structured data to a statistical package that does support such analyses.
Implementation Details
Please see the article on optimal designs for relevant implementation details.
Further Reading (optional)
Read this seminal paper describing what split-plots are and how they arise in real life.
To learn how to define a hard to change factor in Synthace, click here.
To learn about other factor types, click here.