A short description of the post.
It’s often impossible to collect all the samples for an experiment at once, which can lead to batch effects downstream. This motivates the use of blocked designs — how should we think of blocked designs in the context of response surface experiments?
First, let’s build intuition through an example. Suppose we are using a central composite design and \(K = 2\), but we can’t complete all 10 runs at once. At most, during a single pass, we can collect 5 samples.
Idea: Split the factorial and axial samples into two blocks, but tie them together at the center points.
The main technical condition that generalizes this idea is called block orthogonality. A design is called block orthogonal if,
Each block \(X_{b}\) is orthogonal, i.e., \(X_{b}^{T}X_{b}\) = I when we stack all design points for the block into an \(n_{b} \times K\) matrix \(X_{b}\).
For each factor \(k\) and each block \(b\), \[\begin{align*} \frac{\sum_{u = 1}^{n_{b}} x_{iu}^{2}}{\sum_{u = 1}^{N} x_{iu}^{2}} = \frac{n_{b}}{N} \end{align*}\] this ensures that no one block has undue influence over the eventual estimates.
For arbitrary number of factors \(K\), the central composite design turns can be made block orthogonal by dividing into two blocks (factor and axial points, tied at the center). It can be broken into even more blocks for particular choices of block size; this is summarized in Table 11.11.
We can use the same code we used to generate central composite designs before, but using the blocks
argument.
codings <- list(
time_coded ~ (time - 35) / 5,
temp_coded ~ (temp - 150) / 5
)
ccd_design <- ccd(~ time_coded + temp_coded, coding = codings)
ccd_design
run.order std.order time temp Block
1 1 7 35.00000 150.0000 1
2 2 1 30.00000 145.0000 1
3 3 6 35.00000 150.0000 1
4 4 5 35.00000 150.0000 1
5 5 4 40.00000 155.0000 1
6 6 3 30.00000 155.0000 1
7 7 2 40.00000 145.0000 1
8 8 8 35.00000 150.0000 1
9 1 1 27.92893 150.0000 2
10 2 2 42.07107 150.0000 2
11 3 7 35.00000 150.0000 2
12 4 8 35.00000 150.0000 2
13 5 4 35.00000 157.0711 2
14 6 3 35.00000 142.9289 2
15 7 6 35.00000 150.0000 2
16 8 5 35.00000 150.0000 2
Data are stored in coded form using these coding formulas ...
time_coded ~ (time - 35)/5
temp_coded ~ (temp - 150)/5
ggplot(decode.data(ccd_design)) +
geom_point(
aes(x = time, y = temp, col = Block),
position = position_jitter(w = 0.5, h = 0.5),
size = 3
) +
scale_color_brewer(palette = "Set2") +
coord_fixed()
Suppose we wanted to block more than just the factorial and the axial runs; for example, we may only be able to run 2 examples per batch. In this case, we might block the factorial runs by confounding a high-order interaction with the blocks. For example, the code below blocks the 32 runs of a \(2^{5}\) factorial into blocks of size 8.
Finally, let’s check the orthogonal blocking property for one of the blocks. Remember, we need to make sure that (1) the design within the block is diagonal and (2) the fraction of norm for a column is proportional to the number of samples within the block. This is checked below for Block 1.
ggplot(decode.data(blocked_ccd)) +
geom_point(
aes(x = time, y = temp, col = Block),
position = position_jitter(w = 1, h = 1),
size = 3, alpha = 0.6
) +
scale_color_brewer(palette = "Set2") +
coord_fixed()
x <- as.data.frame(blocked_ccd)
xb <- x[x$Block == 1, ] %>%
select(ends_with("_")) %>%
as.matrix()
t(xb) %*% xb # orthogonality
time_ temp_ power_ rate_ cooling_
time_ 8 0 0 0 0
temp_ 0 8 0 0 0
power_ 0 0 8 0 0
rate_ 0 0 0 8 0
cooling_ 0 0 0 0 8
[1] 0.1935484
12 / nrow(blocked_ccd) # nb / N
[1] 0.1935484