Statistical Experimental Design: Following-up Two-Factor Fits

We’ll cover the analogs of performing multiple comparisons, and doing model diagnostics for the two-factor model. Almost everything will be familiar from our experience with single-factor models.

library(readr)
library(dplyr)
battery <- read_table2("https://uwmadison.box.com/shared/static/vmxs2wcsdxkdjujp85nw5kvk83xz4gl9.txt") %>%
  mutate_at(vars(-Life), as.factor)
fit <- lm(Life ~ Material * Temperature, data=battery)

Multiple Comparisons

We may want to use contrasts, to find out exactly how a particular factor is associated with the response.
- Subtlety: If the factor under investigation interacts with the other one, its effects will depend on that other factor.
- Solution: Fix a level for the other factor, and study the influence of levels for the factor of interest.
- Example: Fix temperature, and use Tukey’s HSD to study pairwise difference between materials, for that fixed temperature.

Figure 1: All pairwise contrasts between levels of one factor, restricted to a single level of another.

library(emmeans)
emmeans(fit, pairwise ~ Material | Temperature)$contrasts

Temperature = 15:
 contrast estimate   SE df t.ratio p.value
 1 - 2      -21.00 18.4 27  -1.143  0.4967
 1 - 3       -9.25 18.4 27  -0.503  0.8703
 2 - 3       11.75 18.4 27   0.639  0.7998

Temperature = 70:
 contrast estimate   SE df t.ratio p.value
 1 - 2      -62.50 18.4 27  -3.402  0.0058
 1 - 3      -88.50 18.4 27  -4.817  0.0001
 2 - 3      -26.00 18.4 27  -1.415  0.3475

Temperature = 125:
 contrast estimate   SE df t.ratio p.value
 1 - 2        8.00 18.4 27   0.435  0.9012
 1 - 3      -28.00 18.4 27  -1.524  0.2959
 2 - 3      -36.00 18.4 27  -1.959  0.1419

P value adjustment: tukey method for comparing a family of 3 estimates

Model Checking

Our key assumptions are independence, normality, and equal variances for the \(\epsilon_{ijk}\)’s. Our diagnostics are based on residuals.

battery <- battery %>%
  mutate(resid = resid(fit), y_hat = predict(fit)) # create two new columns
ggplot(battery) +
  geom_point(aes(Temperature, resid))

Plot residuals against each of the two factors

ggplot(battery) +
  geom_point(aes(y_hat, resid))

Plot fitted vs. residual value

qqnorm(battery$resid)
qqline(battery$resid, col = "red")

Make normal probability plots

Choosing the Sample Size

How should you choose how many replicates to have at each combination of the two factors?
- Simulate a model using a particular configuration of coefficients.
- See how your power to detect effects varies as you increase the sample size.
For example, in the block below, we simulate data from where the true \(\tau_i\)’s and \(\beta_j\)’s are \(\left(0, 1, 2\right)\) and \(\left(0, 1\right)\), respectively. We increase the number of samples from 2 to 10 and draw 50 replicates each. Fitting a linear model to each dataset allows us to see how the sample size influences our final \(p\)-values.

tau <- c(0, 1, 2)
beta <- c(0, 1)
ns <- seq(2, 10, by = 2)
b <- 1 # a counter variable
sims <- list()
for (k in seq_along(ns)) {
  for (sim_rep in seq_len(50)) {
    for (i in seq_along(tau)) {
      for (j in seq_along(beta)) {
        sims[[b]] <- data.frame(
            "factor_1" = i,
            "factor_2" = j,
            "sample_size" = ns[k],
            "replicate" = seq_len(ns[k]),
            "value" = rnorm(ns[k], tau[i] + beta[j]),
            "sim_rep" = sim_rep
        )
        b <- b + 1
      }
    }
  }
}
sims <- bind_rows(sims)

The block below visualizes an example replicate. We have 50 datasets like this for every value of \(n\).

ggplot(sims %>% filter(sim_rep == 1)) + # visualize only the first replicate
  geom_point(aes(factor_1, value)) +
  facet_grid(sample_size ~ factor_2)

Next, we fit a linear model to each simulation replicate and plot the associated \(p\)-values. The code to split the datasets and fit an lm to each is somewhat complicated – the important thing to takeaway here is that we can use a simulation to see exactly how the \(p\)-values decrease as a function of the sample size (and hence allow us to draw stronger conclusions).

library(broom)
library(purrr)
library(tidyr)

power <- sims %>% # get p.values and estimates from every single run
  split(list(.$sample_size, .$sim_rep)) %>%
  map_dfr(~ tidy(lm(value ~ factor_1 + factor_2, data = .)), .id = "group") %>%
  separate(group, c("sample_size", "sim_rep"), convert = TRUE)

ggplot(power) +
  geom_jitter(aes(sample_size, p.value)) +
  geom_hline(yintercept = 0.05, col = "red") +
  facet_wrap(~ term)

Interactions

If we only have one replicate per cell, then we can’t estimate an interaction effect. If we tried, we’d be able to perfectly fit the data, so there would be no way to estimate \(\sigma^2\).
As a general recipe, to check for interactions, we can perform residual analysis on the main effects model. Alternatively, we could use Tukey’s additivity test, which checks whether a multiplicative form of the interaction is present, i.e., does \[y_{ijk} = \mu + \tau_i + \beta_j + \gamma \tau_i \beta_j\] fit significantly better than the main effects model?