Mutual exclusion: what is it and why is it special?

It’s a common occurrence in biological experiments that we want to try out different buffers or reagents in a reaction. It’s also usually the case that these different chemical agents will have different levels of potency, requiring different concentration or volume ranges to be used for one vs the other.

DOE designs can accommodate this situation, but if you try to create a design with these properties directly you will immediately hit a problem: the maths used in DOE doesn’t work if the contents of any column of the design can be perfectly predicted from some combination of the others.

That is exactly what happens if you have two columns, one holding (say) the volume of buffer A and another holding the volume of buffer B. If A and B are strict alternatives to each other, every run where buffer A is zero has a non-zero amount of B, and vice versa. This makes the design matrix singular, which is something design software can’t cope with and which will lead to an error. However, there’s nothing actually wrong with what you’re trying to do; it’s just a question of framing it in a way the designer can cope with.
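To see how such a dependency can arise, here is a minimal sketch in Python. The volumes are made up for illustration, and `matrix_rank` is a small helper written for this example, not part of any design package. It shows two assumed cases: one where each buffer appears at a single non-zero volume, so the intercept column is an exact combination of the two volume columns, and one where each buffer spans a range, so the A×B interaction column is zero in every run.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix, using exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Case 1 (hypothetical volumes): each run contains buffer A or buffer B, never both,
# and each buffer is used at a single non-zero volume.
vol_a = [10, 10, 0, 0]
vol_b = [0, 0, 2, 2]
# Columns: intercept, volume of A, volume of B.
naive_matrix = [[1, a, b] for a, b in zip(vol_a, vol_b)]
naive_rank = matrix_rank(naive_matrix)
print(naive_rank)  # 2, although the matrix has 3 columns

# Case 2 (hypothetical volumes): each buffer now spans a range, and the model
# also includes the A x B interaction, which is zero in every run.
vol_a = [5, 10, 0, 0]
vol_b = [0, 0, 1, 2]
# Columns: intercept, volume of A, volume of B, A x B interaction.
interaction_matrix = [[1, a, b, a * b] for a, b in zip(vol_a, vol_b)]
interaction_rank = matrix_rank(interaction_matrix)
print(interaction_rank)  # 3, although the matrix has 4 columns
```

In both cases the rank is lower than the number of columns, which is what it means for the design matrix to be singular: the model terms cannot all be estimated separately.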

Solving the problem

The trick is to code your mutually exclusive level choices in a way that keeps the design non-singular. The logic behind this is fairly straightforward: the choice of component is encoded as a categorical factor with the appropriate number of levels. To allow a different range for each level, we then introduce a second variable that indicates how much of the component we want, and simply re-scale it to fit whichever interval we want for each factor level.

This works because the choice of buffer and the relative amount of it are independent of each other. The only dependency between the two is in how you scale that relative amount to reflect what counts as a lot or a little of that component.
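As a rough sketch of the re-scaling step, assume the relative amount is coded on the usual -1 to +1 scale and mapped linearly onto each buffer’s own interval. The ranges below and the name `actual_amount` are hypothetical and chosen purely for illustration.

```python
# Hypothetical volume ranges (uL) for each buffer; the numbers are made up.
RANGES = {"A": (5.0, 20.0), "B": (0.5, 2.0)}


def actual_amount(buffer, coded):
    """Map a coded amount in [-1, +1] onto the range of the chosen buffer."""
    if not -1.0 <= coded <= 1.0:
        raise ValueError("coded amount must lie in [-1, +1]")
    low, high = RANGES[buffer]
    # Linear re-scaling: -1 maps to `low`, +1 maps to `high`.
    return low + (coded + 1.0) / 2.0 * (high - low)


print(actual_amount("A", -1.0))  # 5.0
print(actual_amount("A", 1.0))   # 20.0
print(actual_amount("B", 0.0))   # 1.25
```

The design itself only ever sees the buffer type and the coded amount; the buffer-specific range is applied afterwards, so “high” means the top of the range whichever buffer is chosen.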

Why this is important

Fundamentally it’s assumed that the levels in a design always have the same meaning independently of all the levels of all the other factors. So if you have a “high” level of buffer it should always be high regardless of which buffer it is.

So it’s actually essential to do tricks like this if you want to meaningfully compare different choices. To help see this, imagine the simpler situation where we want to compare ways of sweetening tea using just a single categorical variable (“sugar” vs “sweetener”, for example).

For most artificial sweeteners, the mass you need to achieve a given degree of subjective sweetness is much, much smaller than it is for sugar, so equalising the mass here would not make sense. If you were to normalise on a dose of (e.g.) 0.1g of sweetener, you’d actually be comparing sweet tea with unsweetened tea, since 0.1g of sugar is barely noticeable.

By the same token, if you wanted to examine the type of sweetener and its amount at the same time, you would have to define “high” and “low” differently for each choice. Otherwise there is a trivial dependency between the factor representing the type of sweetener and the factor representing the amount of sweetener, and there’s no point doing the experiment since the analysis won’t make any real sense.

Under the hood

Dealing with this situation is possible in most design packages, but it’s an example of how the requirements of the design (strict independence between all columns in the design) and the requirements of the experiment (a simple table showing you how much of everything you need to add to each mixture) are opposed.

Since third-party design software exists to make designs, it typically won’t help you in this case. It’s up to you to work this out, create the design in the right way, and then map the resulting factors into the table required to implement the experiment.

Synthace’s design platform handles this situation automatically when you ask for mutual exclusion by:

Creating a categorical variable representing the type of something
Creating a scaled numerical variable representing the amount of that thing
Automatically setting up the equations to convert amounts from the scaled form into the actual amounts for implementation
Unpicking all this at the analysis stage so that what you analyse is still in terms of the type and amount as independent (but likely interacting) factors.
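The four steps above can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not a description of how Synthace implements it: the ranges, the three coded levels, the full factorial layout, and the names `to_actual` and `to_coded` are all hypothetical.

```python
from itertools import product

# Hypothetical volume ranges (uL) for each buffer; the numbers are made up.
RANGES = {"A": (5.0, 20.0), "B": (0.5, 2.0)}
CODED_LEVELS = (-1.0, 0.0, 1.0)


def to_actual(buffer, coded):
    """Convert a coded amount in [-1, +1] into a volume for this buffer."""
    low, high = RANGES[buffer]
    return low + (coded + 1.0) / 2.0 * (high - low)


def to_coded(buffer, actual):
    """Inverse of to_actual: recover the coded amount from a volume."""
    low, high = RANGES[buffer]
    return 2.0 * (actual - low) / (high - low) - 1.0


# First two items: the design is expressed as (type, coded amount) pairs.
# A full factorial is used here only to keep the example small.
design = list(product(RANGES, CODED_LEVELS))

# Third item: convert to an implementation table with one volume column per buffer.
table = []
for buffer, coded in design:
    row = {f"volume_{name}": 0.0 for name in RANGES}
    row[f"volume_{buffer}"] = to_actual(buffer, coded)
    table.append(row)

# Fourth item: recover (type, coded amount) from each row for analysis.
# This relies on every range having a lower bound above zero.
recovered = []
for row in table:
    buffer = next(name for name in RANGES if row[f"volume_{name}"] > 0.0)
    recovered.append((buffer, to_coded(buffer, row[f"volume_{buffer}"])))

for row in table:
    print(row)
```

The table is what you would take to the bench, while the analysis works on the recovered type and coded amount, which remain independent of each other.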

Well done on making it to the end of this tutorial.

To learn how to define a mutually exclusive factor, see the article “How to define a mutually exclusive factor”.
