Drawing conclusions from parameter estimates.
We will often want to know whether estimated main or interaction effects are significant. We can use ANOVA, though we have to be cautious when \(n\) is small.
The associated sum of squares is \[ \frac{1}{2^2 n}\left(\text{Contrast}\right)^2 \] for example, \[ SS_A = \frac{1}{2^2 n}\left[ab + a - b - (1)\right]^2 \]
The associated ANOVA decomposition is \[ SS_{\text{Total}} = SS_A + SS_B + SS_{AB} + SS_E \] and since the factors all have two levels, the df’s for the main and interaction terms are all 1. The df of \(SS_T\) is \(n 2^2 - 1\) (number of samples minus one). Taking the ratio between main and interaction \(SS\) terms and \(MS_{E} (= \frac{1}{4\left(n - 1\right)}SS_{E})\) gives the basis for \(F\)-statistics in the ANOVA table.
library(readr)
yield <- read_table2("https://uwmadison.box.com/shared/static/bfwd6us8xsii4uelzftg1azu2f7z77mk.txt")
yield
# A tibble: 12 × 4
A B Rep Yield
<chr> <chr> <chr> <dbl>
1 - - I 28
2 + - I 36
3 - + I 18
4 + + I 31
5 - - II 25
6 + - II 32
7 - + II 19
8 + + II 30
9 - - III 27
10 + - III 32
11 - + III 23
12 + + III 29
lm + aov
approach used in general factorial designs applies to \(2^{K}\) designs. The only difficulty is that the data were originally coded as +
and -
, and we need -1 and 1’s. For this, we’ve prepared a small function called coded
and used it to create new columns, cA
and cB
which convert those symbols into their numeric equivalents. From the ANOVA table below, we can see that both factors have strong effects, though A’s is stronger than B’s. We can also see that there is no detectable interaction effect, which is consistent with the plot from the previous notes.coded <- function(x) ifelse(x == '-', -1, 1)
yield <- yield %>%
mutate(cA = coded(A), cB = coded(B))
fit <- lm(Yield ~ cA * cB, data = yield)
summary(aov(fit))
Df Sum Sq Mean Sq F value Pr(>F)
cA 1 208.33 208.33 53.191 8.44e-05 ***
cB 1 75.00 75.00 19.149 0.00236 **
cA:cB 1 8.33 8.33 2.128 0.18278
Residuals 8 31.33 3.92
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Intercept
estimate (27.5) gives the response when all the factors are turned off. The cA
and cB
estimates (4.17 and -2.5) describe how the response changes when those factors are turned on. The interaction effect measures how the effect differs from the additive effect when both factors are active (it is 0.833 larger than what would be expected if we just added cA
and cB
).summary(fit)
Call:
lm(formula = Yield ~ cA * cB, data = yield)
Residuals:
Min 1Q Median 3Q Max
-2.000 -1.333 -0.500 1.083 3.000
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 27.5000 0.5713 48.135 3.84e-11 ***
cA 4.1667 0.5713 7.293 8.44e-05 ***
cB -2.5000 0.5713 -4.376 0.00236 **
cA:cB 0.8333 0.5713 1.459 0.18278
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.979 on 8 degrees of freedom
Multiple R-squared: 0.903, Adjusted R-squared: 0.8666
F-statistic: 24.82 on 3 and 8 DF, p-value: 0.0002093