Statistical Experimental Design: Principles and Vocabulary

What is an experiment?

A test or series of runs in which purposeful changes are made to the input variables of a process or system so that we may observe and identify the reasons for changes that may be observed – Montgomery, pg. 1

More simply, in an experiment, our goal is to learn how inputs affect outputs. It’s not enough to passively watch – we need to see how turning certain “knobs” affects the system.

To illustrate, we can consider a planting example. There are a variety of factors that could influence how the plants grow (soil type, watering schedule, etc.). We could allocate different plots of land to trying different configurations of factors. At the end, we hope we can arrive at generalizable knowledge about which configurations we should use during future growing seasons.

Randomization

One of three key principles of experimental design is randomization. The book says that randomization has been applied if the,

Allocation of experimental material and the order in which the individual runs are performed are randomly determined – Montgomery, pg. 11

More simply – we should assign treatments using a coin toss (or random number generator). Why is this important? There are many factors besides treatment that can influence outcome. We don’t want these superfluous factors to bias our conclusions.
[Treating the sickest patients] What could go wrong in the absence of randomization? Suppose we are at a hospital and are trying to see whether a new treatment is effective. If we don’t randomize, we might end up only treating sicker patients than usual. If the sickest patients have worse outcomes on average, then we might underestimate the effectiveness of our treatment.
If we randomize, the differences coming from these extraneous factors (like amount of sickness) will cancel out. However, if the treatment does have an effect, we will be able to detect it.

Replication

Replication is the second of the three main principles of experimental design. A replicate is,

An independent run of each factor combination – Montgomery, pg. 12

Replication is important because it helps us understand run-to-run variation. If we had only grown the plant once, we’d have no idea about the range of variation we’d expect even when fixing the influential factors. Moreover, if we can get many replicates, our estimates of the mean will improve (orange -> red)

The book highlights a distinction between replicates and repeated measures. Repeated measures are several measurements on the same experimental unit. Replicates are distinct experimental units drawn under the same overarching conditions.
[Computer Chips]. If we were trying to build better computer chips, then repeated measures would be several measurements on the same chip. Replicates would be completely independent chips.

Blocking

The final major principle of experimental design is blocking, defined as,

A design technique used to improve the precision with which comparisons among the factors of interest are made. Often used to reduce or eliminate variability from nuisance factors.

[Shoe soles] Suppose that we want to test the difference between these purple and green shoe sole types (which wears down faster?).

Two designs seem natural,

Design 1 [No blocking]: Each person is randomly assigned a shoe sole type.
Design 2 Blocking: Each person gets one of each shoe sole, and randomly wears on on the left / right foot.

Under design 1, any true shoe sole effect would be drowned out by the amount of walking each person did In the blocked design, consistent effects within individuals become detectable. In this example, blocking helped remove nuisance variation resulting from some people walking more than others.

Each nearby pair of dots is a person assigned two of the same sole type. Differences between sole types are washed out by differences in how much people walk.

Figure 1: Each nearby pair of dots is a person assigned two of the same sole type. Differences between sole types are washed out by differences in how much people walk.

Figure 2: When each person is assigned one of each type, consistent improvements in the purple shoe sole become clearer.