Reducing the number of samples required in factorial designs.
label | A | B | AB |
---|---|---|---|
(1) | - | - | + |
a | + | - | - |
b | - | + | - |
ab | + | + | + |
label | A | B | AB | C |
---|---|---|---|---|
c | - | - | + | + |
a | + | - | - | - |
b | - | + | - | - |
abc | + | + | + | + |
\(2^{K - 1}\): This denotes a fractional factorial design where we take \(\frac{1}{2}\) of the samples in a \(2^{K}\) design.
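As a minimal sketch of the table above, the half fraction can be built in base R by crossing A and B and then setting C equal to the AB column. The object name `half` and the -1 / +1 coding are illustrative assumptions, not part of the original notes.

```r
# Full 2^2 design in A and B, coded as -1 / +1.
half <- expand.grid(A = c(-1, 1), B = c(-1, 1))

# Assign the third factor to the AB interaction column.
half$AB <- half$A * half$B
half$C <- half$AB

# Label each run by the lowercase letters of the factors at their high level.
half$label <- apply(half[, c("A", "B", "C")], 1, function(x) {
  paste(c("a", "b", "c")[x > 0], collapse = "")
})
half
```

Only four runs are used, yet three factors are varied, which is the sense in which this is half of a \(2^{3}\) design.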
Aliases: Above, we assumed that AB was null. Suppose that it weren’t, though. Notice that the contrasts used for the pairs below are all equal, \[\begin{align*} \left[AB\right] &= \frac{1}{2}\left(c - a - b + abc\right) = \left[C\right] \\ \left[BC\right] &= \frac{1}{2}\left(-c + a - b + abc\right) = \left[A\right] \\ \left[AC\right] &= \frac{1}{2}\left(-c - a + b + abc\right) = \left[B\right] \end{align*}\] and we have no conclusive way of distinguishing between these aliased effects, besides appeals to the heredity principle or domain knowledge. To denote this unidentifiability, we will use bracket notation, for example, \[\begin{align*} \left[A\right] &= A + BC. \end{align*}\]
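The equality of these contrasts can be checked numerically. The sketch below assumes a -1 / +1 coding and uses an illustrative data frame `d`; it only compares design columns, so no response values are needed.

```r
# Half fraction defined by C = AB, coded as -1 / +1.
d <- expand.grid(A = c(-1, 1), B = c(-1, 1))
d$C <- d$A * d$B

# Within this fraction, each interaction column is identical to a main-effect
# column, so any response vector gives the same contrast for both members of a pair.
same_AB_C <- all(d$A * d$B == d$C)
same_BC_A <- all(d$B * d$C == d$A)
same_AC_B <- all(d$A * d$C == d$B)
c(AB_vs_C = same_AB_C, BC_vs_A = same_BC_A, AC_vs_B = same_AC_B)
```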
Generators: In the example, we set \(C = AB\). Multiplying both sides by \(C\) and using the fact that \(C^2 = I\), we find \(ABC = I\). The word ABC will be called the generator for this fraction.
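As a worked example of how the defining relation \(I = ABC\) reproduces the alias pairs, multiply each main effect by the word \(ABC\) and cancel squared letters using \(A^2 = B^2 = C^2 = I\): \[\begin{align*} A \cdot ABC &= A^2 BC = BC \\ B \cdot ABC &= A B^2 C = AC \\ C \cdot ABC &= A B C^2 = AB \end{align*}\] so \(A\) is aliased with \(BC\), \(B\) with \(AC\), and \(C\) with \(AB\), in agreement with the equal contrasts in the Aliases paragraph.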
Complementary designs: The complementary fraction of a fractional factorial design is the set of corners on which we didn’t take samples. Often, if there are strong aliased effects in an initial fractional design, the complementary fraction will be run in the next experiment.
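A minimal sketch of the two fractions for the three-factor example, assuming a -1 / +1 coding. The names `full`, `principal`, and `complementary` are illustrative.

```r
# Full 2^3 design, coded as -1 / +1.
full <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# Fraction used above: runs with C = AB (equivalently ABC = +1).
principal <- full[full$A * full$B * full$C == 1, ]

# Complementary fraction: the remaining corners, where C = -AB (ABC = -1).
complementary <- full[full$A * full$B * full$C == -1, ]

# Together the two fractions recover all 8 corners of the cube.
nrow(principal) + nrow(complementary) == nrow(full)
```

Running both fractions gives the full \(2^{3}\) design, which is why the complementary fraction can separate effects that were aliased in the first experiment.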
Resolution: A fractional design has resolution \(R\) if no \(p\)-factor effect is aliased with an effect containing fewer than \(R - p\) factors.
\(R\) | \(p\) | \(< R - p\) | Interpretation |
---|---|---|---|
3 | 1 | \(< 2\) | Main effects aren’t aliased with other main effects, but could be aliased with two-way interactions. |
4 | 1 | \(< 3\) | Main effects aren’t aliased with any other main effects or with any two-way interactions, but could be aliased with three-way interactions. |
4 | 2 | \(< 2\) | Two-way interactions aren’t aliased with main effects. |
5 | 1 | \(< 4\) | Main effects aren’t aliased with other main effects, two-way, or three-way interactions. |
5 | 2 | \(< 3\) | Two-way interactions aren’t aliased with main effects or two-way interactions. |
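The definition can be checked mechanically for a half fraction with a single defining word. The sketch below is illustrative: the helper `multiply_words` is a hypothetical name, and the resolution is computed as the largest \(R\) satisfying the definition, that is, the smallest value of \(p\) plus the size of the alias.

```r
# Multiply two effect "words" (e.g. "AB" and "ABCD"): letters appearing in both
# cancel, because each squared factor equals the identity I.
multiply_words <- function(w1, w2) {
  l1 <- strsplit(w1, "")[[1]]
  l2 <- strsplit(w2, "")[[1]]
  keep <- sort(c(setdiff(l1, l2), setdiff(l2, l1)))
  paste(keep, collapse = "")
}

# With a single defining word, each effect has exactly one alias.
defining_word <- "ABCD"
effects <- c("A", "B", "C", "D", "AB", "AC", "AD", "BC", "BD", "CD")
alias_of <- vapply(effects, multiply_words, character(1), w2 = defining_word)

# Sizes of each effect (p) and of its alias; the smallest sum is the resolution.
p <- nchar(effects)
alias_size <- nchar(alias_of)
resolution <- min(p + alias_size)
resolution
```

For the word \(ABCD\), main effects pair with three-way interactions and two-way interactions pair with each other, which corresponds to the \(R = 4\) rows of the table.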
Let’s suppose we had only run half of the runs in the filtration example from week 9 [1]. The original experiment is a \(2^4\) design; we will choose the fraction corresponding to the word¹ \(I = ABCD\).
This is a general principle for \(2^{K - 1}\) designs: choose which configurations to run by defining the relation \(K = A B C \dots J\). This turns a full factorial design with \(2^{K - 1}\) runs, which studies only \(K - 1\) factors, into a fractional \(2^{K - 1}\) factorial that studies \(K\) factors in the same number of runs.
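The construction can be sketched as a small function. The name `half_fraction`, the use of `LETTERS` for factor names, and the -1 / +1 coding are assumptions made for illustration.

```r
# Build a 2^(K-1) half fraction studying K factors: start from a full factorial
# in the first K - 1 factors and set the last factor to the product of the others.
half_fraction <- function(K) {
  stopifnot(K >= 2, K <= 26)
  factor_names <- LETTERS[seq_len(K)]
  levels <- rep(list(c(-1, 1)), K - 1)
  names(levels) <- factor_names[-K]
  design <- expand.grid(levels)
  # Row-wise product of the first K - 1 columns defines the K-th factor.
  design[[factor_names[K]]] <- apply(design, 1, prod)
  design
}

half_fraction(4)  # 8 runs in A, B, C, with D = ABC
```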
The `filter` command below does exactly this: it removes all samples that are not consistent with the \(D = ABC\) fraction.
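A minimal sketch of this kind of filtering, assuming the four factors are stored as -1 / +1 columns named A, B, C, and D. The data frame `design` is a hypothetical stand-in that contains only the design columns, not the filtration responses.

```r
# Hypothetical full 2^4 design, coded as -1 / +1 (no responses included).
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1), D = c(-1, 1))

# Keep only the runs consistent with D = ABC, i.e. the I = ABCD fraction.
# With dplyr loaded, filter(design, D == A * B * C) expresses the same condition.
fraction <- subset(design, D == A * B * C)
nrow(fraction)  # 8 of the 16 runs remain
```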
Call:
lm(formula = Rate ~ A * B * C * D, data = filtration)
Residuals:
ALL 8 residuals are 0: no residual degrees of freedom!
Coefficients: (8 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 70.75 NaN NaN NaN
A 9.50 NaN NaN NaN
B 0.75 NaN NaN NaN
C 7.00 NaN NaN NaN
D 8.25 NaN NaN NaN
A:B -0.50 NaN NaN NaN
A:C -9.25 NaN NaN NaN
B:C 9.50 NaN NaN NaN
A:D NA NA NA NA
B:D NA NA NA NA
C:D NA NA NA NA
A:B:C NA NA NA NA
A:B:D NA NA NA NA
A:C:D NA NA NA NA
B:C:D NA NA NA NA
A:B:C:D NA NA NA NA
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 7 and 0 DF, p-value: NA
`TRUE` elements outside of the diagonal are aliased effects. For example, the `TRUE` in column A on row BCD means that A is aliased with BCD.
X <- model.matrix(fit)
t(X) %*% X != 0 # TRUE on off diagonals are aliases
(Intercept) A B C D A:B A:C B:C
(Intercept) TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
A FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
B FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
C FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
D FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
A:B FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
A:C FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
B:C FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
A:D FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
B:D FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
C:D FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
A:B:C FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
A:B:D FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
A:C:D FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
B:C:D FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
A:B:C:D TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
A:D B:D C:D A:B:C A:B:D A:C:D B:C:D A:B:C:D
(Intercept) FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
A FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
B FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
C FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
D FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
A:B FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
A:C FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
B:C TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
A:D TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
B:D FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
C:D FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
A:B:C FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
A:B:D FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
A:C:D FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
B:C:D FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
A:B:C:D FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
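To see why a nonzero off-diagonal entry signals aliasing, the sketch below rebuilds the \(I = ABCD\) fraction without any response. It assumes a -1 / +1 coding, and the object names are illustrative. Aliased columns of the model matrix are identical, so their inner product equals the number of runs, while all other pairs of columns are orthogonal.

```r
# Half fraction I = ABCD of a 2^4 design, coded as -1 / +1 (assumed coding).
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))
design$D <- design$A * design$B * design$C

# Model matrix with all main effects and interactions; no response is needed.
X <- model.matrix(~ A * B * C * D, data = design)

# Cross products of the columns; crossprod(X) is the same as t(X) %*% X.
XtX <- crossprod(X)
XtX["A", "B:C:D"]  # aliased pair: equals the number of runs
XtX["A", "B"]      # not aliased: orthogonal columns
```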
These off-diagonal elements are automatically extracted in the function below.
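library(dplyr)
library(tidyr)
library(tibble)

# Return every pair of model-matrix columns with a nonzero cross product,
# i.e. every pair of aliased effects.
aliases <- function(fit) {
  X <- model.matrix(fit)
  (t(X) %*% X) %>%
    as.data.frame() %>%
    rownames_to_column("effect") %>% # replaces the deprecated add_rownames()
    pivot_longer(-effect, names_to = "alias") %>%
    filter(value != 0, effect != alias) %>%
    select(-value)
}

aliases(fit)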
# A tibble: 16 × 2
effect alias
<chr> <chr>
1 (Intercept) A:B:C:D
2 A B:C:D
3 B A:C:D
4 C A:B:D
5 D A:B:C
6 A:B C:D
7 A:C B:D
8 B:C A:D
9 A:D B:C
10 B:D A:C
11 C:D A:B
12 A:B:C D
13 A:B:D C
14 A:C:D B
15 B:C:D A
16 A:B:C:D (Intercept)
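Each alias pair appears twice in this output, once in each direction. A possible base R variant that lists each pair once is sketched below. The helper name `alias_pairs` is hypothetical, and the design is rebuilt under an assumed -1 / +1 coding so that the block runs on its own.

```r
# Rebuild the I = ABCD half fraction under an assumed -1 / +1 coding.
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))
design$D <- design$A * design$B * design$C
X <- model.matrix(~ A * B * C * D, data = design)

# Hypothetical helper: list each aliased pair of columns once.
alias_pairs <- function(X) {
  XtX <- crossprod(X)
  # Keep only entries strictly above the diagonal so each pair is reported once.
  idx <- which(XtX != 0 & upper.tri(XtX), arr.ind = TRUE)
  data.frame(
    effect = rownames(XtX)[idx[, "row"]],
    alias = colnames(XtX)[idx[, "col"]]
  )
}

alias_pairs(X)
```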
So, revisiting the estimated fit, we can conclude that \[\begin{align*} \widehat{\mu}+\widehat{A B C D} &=70.75 \\ \widehat{A}+\widehat{B C D} &=9.5 \times 2=19 \\ \widehat{B}+\widehat{A C D} &=0.75 \times 2=1.5 \\ \widehat{C}+\widehat{A B D} &=7 \times 2=14 \\ \widehat{D}+\widehat{A B C} &=8.25 \times 2=16.5 \\ \widehat{A B}+\widehat{C D} &=-0.5 \times 2=-1 \\ \widehat{A C}+\widehat{B D} &=-9.25 \times 2=-18.5 \\ \widehat{B C}+\widehat{A D} &=9.5 \times 2=19 \end{align*}\]
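The factor of 2 reflects the usual convention that, with factors coded -1 / +1 (an assumption about how the data were coded), a regression coefficient is half of the corresponding effect estimate. A small sketch that reproduces the arithmetic from the coefficients printed in the summary, with illustrative object names:

```r
# Coefficients copied from the lm() summary above (intercept omitted).
coefs <- c(A = 9.50, B = 0.75, C = 7.00, D = 8.25,
           "A:B" = -0.50, "A:C" = -9.25, "B:C" = 9.50)

# Alias partner of each estimable term under I = ABCD.
partner <- c(A = "B:C:D", B = "A:C:D", C = "A:B:D", D = "A:B:C",
             "A:B" = "C:D", "A:C" = "B:D", "B:C" = "A:D")

# With -1 / +1 coding (an assumption here), effect estimate = 2 * coefficient.
alias_sums <- data.frame(
  terms = paste(names(coefs), "+", partner[names(coefs)]),
  estimate = unname(2 * coefs)
)
alias_sums
```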
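The estimates suggest real main effects for \(A\), \(C\), and \(D\). Among the aliased two-way interactions, \(AC\) and \(AD\) are more plausible than \(BD\) and \(BC\): the estimated main effect for \(B\) is small (1.5), and by the heredity principle an interaction is more likely to be active when its parent main effects are.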
¹ The word \(I = ABCD\) emerged from defining \(D = ABC\).