Examples of 2^K Designs

Three case studies in using \(2^K\) designs.

Kris Sankaran true
11-04-2021

Readings 6.6, Rmarkdown

  1. Like the corresponding section in the book, these notes introduce no new technical material. Instead, they illustrate end-to-end analysis workflows for \(2^K\) designs and highlight the types of judgments that need to be exercised in practice.

Example 6.3

  1. An experiment was done to see how the “advance rate of a drill” varied as a function of four factors1 which we will call A, B, C, and D. To start, we code the variables A to D so that they are \(\pm 1\)’s instead of the characters + and -.
library(dplyr)
library(readr)
code <- function(x) ifelse(x == '-', -1, 1)
drill <- read_csv("https://uwmadison.box.com/shared/static/7l8bpcu36a12a8c0chlh4id0qezdnnoe.csv") %>%
  mutate_at(vars(A:D), code)
  1. It always helps to make a preliminary plot of the data. The two columns encode the value of \(A\) and the two rows encode \(D\). Informally, it seems that adding B and C both increase the response. The slope in the bottom right panel is also a bit steeper than the others, so perhaps there is an interaction between B and A. With four factors though, it’s hard to judge everything from just a figure, so the effect estimates should be a useful guide.
ggplot(drill) +
  geom_point(aes(B, rate, col = as.factor(C))) +
  scale_color_brewer(palette = "Set2") +
  facet_grid(A ~ D)

  1. As a first pass at the analysis, we fit a full \(2^4\) model. The Daniel plot is below. We use the same helper function as the last notes, and again we use the fact that the effects are twice the regression coefficients.
daniel_plot <- function(effects, probs = c(0.3, 0.7)) { 
  qq <- qqnorm(effects, datax = TRUE)
  qqline(effects, col = "red", probs = probs, datax = TRUE)
  text(qq$x, qq$y, names(effects), pos=1)
}

fit <- lm(rate ~ A * B * C * D, drill)
effects <- 2 * coef(fit)[-1]
daniel_plot(effects, c(0.35, 0.65))

  1. The absence of A among the large effects suggests dropping factor A in the fit. However, when we study the residuals, we notice they are heteroskedastic, with larger residuals associated with higher predicted values.
fit <- lm(rate ~ B * C * D, drill) # removed A from the model
drill_resid <- drill %>%
  mutate(residual = resid(fit), y_hat = predict(fit))
ggplot(drill_resid) +
  geom_point(aes(y_hat, residual))

  1. Since the data are rates, we take a log-transform. We refit the full model, which suggests a much simpler set of factors (B, C, and D), with no interactions. The residuals of the associated submodel also look much better now.

Lessons: * Examining residuals can motivate useful transformations of the data.

fit <- lm(log(rate) ~ A * B * C * D, data = drill)
daniel_plot(2 * coef(fit)[-1])
fit <- lm(log(rate) ~ B + C + D, data = drill)

drill_resid <- drill %>%
  mutate(residual = resid(fit), y_hat = predict(fit))
ggplot(drill_resid) +
  geom_point(aes(y_hat, residual))

Example 6.4

  1. An experiment was done to see how defect rate in airplane windows varied according to four factors: temperature (A), clamp time (B), resin flow (C), and press closing time (D). Our first step is exactly the same as before: download the data, code the variables, and plot the raw data. It seems that adding A increases defects and adding C removes defects, but this is a just a heuristic description.
windows <- read_csv("https://uwmadison.box.com/shared/static/62phufkeprheu9gu35mu1e75x6rc2shv.csv") %>%
  mutate_at(vars(A:D), code)

ggplot(windows) +
  geom_point(aes(A, defects, col = as.factor(C))) +
  scale_color_brewer(palette = "Set2") +
  facet_grid(B ~ D)

  1. Below, we estimate the effects and make a Daniel plot. The story seems simple,
    • Temperature (A) has a strong positive effect
    • Resin (C) flow has a slight negative effect.
fit <- lm(defects ~ A * B * C * D, data = windows)
daniel_plot(2 * coef(fit)[-1])

  1. Next, we refit the model with just these two main effects and then examine residuals. This reveals a kind of heteroskedasticity,
fit <- lm(defects ~ A + C, data = windows)
windows$residual <- resid(fit)
ggplot(windows) +
  geom_point(aes(x = B, y = residual))

  1. At this point, we don’t do any transforms, but instead recommend low temperature, high resin flow, and low clamp time (because lower clamp time -> lower variability)

Lessons: * In practice, it’s often useful to take variability into account, rather than just average response * A residual plot can be directly actionable

Dispersion estimates (Optional)

  1. The heuristic in the previous example can be formalized.
    • Let \(S^2\left(j^{+}\right)\) be an estimated standard deviation of responses when contrast \(j\) is active.
    • Theory predicts that \(\log\left(\frac{S^{2}\left(j^{+}\right)}{S^{2}\left(j^{-}\right)}\right)\) will be approximately normal.
    • We call these the dispersions
  2. Since the dispersions are approximately normal, it makes sense to create a QQ plot from them. This has the potential to highlight whether any factors have high discrepancies in spread, as a function of level. The code below loops over every factor (columns of the model matrix M), computes these dispersions, and makes a Daniel plot from them.
M <- model.matrix(defects ~ A * B * C * D, data = windows)[, -1] # remove intercept
S <- list()

for (k in seq_len(ncol(M))) {
  S[[k]] <- data.frame(
    "effect" = colnames(M)[k],
    "sd_plus" = sd(windows$residual[M[, k] == 1]),
    "sd_minus" = sd(windows$residual[M[, k] == -1])
  )
}

S <- do.call(rbind, S)
s_ratio <- setNames(log(S$sd_plus / S$sd_minus), S$effect)
daniel_plot(s_ratio)

Example 6.5

  1. A \(2^{4}\) experiment is setup to improve semiconductor manufacturing. The question of interest is: How do temperature (A), time (B), pressure (C), and gas flow (D) affect oxide thickness of the wafers? The code below downloads the data and reshapes it so that the oxide thickness variable is all in one column.
library(tidyr)
oxide <- read_csv("https://uwmadison.box.com/shared/static/vyk6uoe3zbnonv4n6jcusbrocmt4cvru.csv") %>%
  pivot_longer(starts_with("wafer"), names_to = "variable")

ggplot(oxide) +
  geom_point(aes(A, value, col = as.factor(B))) +
  scale_color_brewer(palette = "Set2") +
  facet_grid(C ~ D)

  1. In the design, four wafers are put in the furnace at a time. Note that these are repeated measures, not replicates! Therefore, we take the average of the wafers, and treat this as an unreplicated design. The block below computes the mean and standard deviation of the response across each configuration of factors A - D.
oxide_collapse <- oxide %>%
  group_by(A, B, C, D) %>% # isolate independent configurations
  summarise(mean = mean(value), var = var(value)) # take average and var. across groups
oxide_collapse
# A tibble: 16 × 6
# Groups:   A, B, C [8]
       A     B     C     D  mean    var
   <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
 1    -1    -1    -1    -1   378  2    
 2    -1    -1    -1     1   380 12    
 3    -1    -1     1    -1   372  6.67 
 4    -1    -1     1     1   378  1.33 
 5    -1     1    -1    -1   381  3.33 
 6    -1     1    -1     1   371  0.667
 7    -1     1     1    -1   385  0.667
 8    -1     1     1     1   376  0.667
 9     1    -1    -1    -1   416  0.667
10     1    -1    -1     1   415 14.7  
11     1    -1     1    -1   390  2    
12     1    -1     1     1   392 34    
13     1     1    -1    -1   448  3.33 
14     1     1    -1     1   446  6    
15     1     1     1    -1   430  8.67 
16     1     1     1     1   429  1.33 
  1. Next, we perform an analysis of the variation in average thickness across factor configurations. A look at the effects suggest that A, B, C, and the interactions AB and AC are likely nonnull.
fit <- lm(mean ~ A * B * C * D, data = oxide_collapse)
daniel_plot(2 * coef(fit)[-1])

  1. To fit the all the main effects together when the interactions AB, AC, we can use the formula ~ A * (B + C). This formula gets expanded into A * B + A * C, and each of the products expands into main effects and interactions (e.g., A * B = A + B + A:B). This captures all the effects we mention in the previous point.
fit <- lm(mean ~ A * (B + C), data = oxide_collapse)
summary(fit) # compare with Table 6.20

Call:
lm(formula = mean ~ A * (B + C), data = oxide_collapse)

Residuals:
   Min     1Q Median     3Q    Max 
-7.125 -2.469  1.000  2.250  6.625 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  399.188      1.049 380.475  < 2e-16 ***
A             21.562      1.049  20.552 1.64e-09 ***
B              9.063      1.049   8.638 5.98e-06 ***
C             -5.187      1.049  -4.944 0.000583 ***
A:B            8.437      1.049   8.042 1.12e-05 ***
A:C           -5.313      1.049  -5.063 0.000489 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.197 on 10 degrees of freedom
Multiple R-squared:  0.9839,    Adjusted R-squared:  0.9759 
F-statistic: 122.3 on 5 and 10 DF,  p-value: 1.237e-08
  1. We can plot the full response surface from the fitted model. The image method in the rsm (“response surface methodology”) package directly outputs this. For example, in the first plot, we see that the response is highest at A and B both equal to 1. The curvature in this surface also suggests that there is an interaction between these two terms.
library(rsm)
image(fit,  ~ A + B + C)

  1. In addition to modeling the average across wafers, we can model the standard deviation. This is potentially useful if we want to find configurations with more consistency in oxide thickness. The estimated effects for this model are shown below.
fit <- lm(var ~ A * B * C * D, data = oxide_collapse)
daniel_plot(2 * coef(fit)[-1])
fit <- lm(var ~ A + B * D, data = oxide_collapse)
coef(fit)
(Intercept)           A           B           D         B:D 
   6.125000    2.708333   -3.041667    2.708333   -3.625000 
  1. We can now use the two response surfaces jointly to determine factor combinations that will have a target oxide thickness, and low variability around that.
library(rsm)
image(fit, ~ A + B + D)

  1. Warning: What would have happened if we treated the repeated measures as true replicates? We would incorrectly include that many factors are relevant when they aren’t — this happens because our estimate of \(\sigma^2\) is too small. This can lead to lots of wasted effort.

Lessons: * Don’t treat repeated measures as replicates, or we risk many false positive effects * It can be useful to model the variance of the response, rather than simply the mean

fit <- lm(value ~ A * B * C * D, data = oxide)
summary(fit)

Call:
lm(formula = value ~ A * B * C * D, data = oxide)

Residuals:
   Min     1Q Median     3Q    Max 
    -6     -1      0      1      8 

Coefficients:
            Estimate Std. Error  t value Pr(>|t|)    
(Intercept) 399.1875     0.3094 1290.369  < 2e-16 ***
A            21.5625     0.3094   69.701  < 2e-16 ***
B             9.0625     0.3094   29.294  < 2e-16 ***
C            -5.1875     0.3094  -16.769  < 2e-16 ***
D            -0.8125     0.3094   -2.626   0.0115 *  
A:B           8.4375     0.3094   27.274  < 2e-16 ***
A:C          -5.3125     0.3094  -17.173  < 2e-16 ***
B:C           1.9375     0.3094    6.263 9.93e-08 ***
A:D           0.5625     0.3094    1.818   0.0753 .  
B:D          -1.9375     0.3094   -6.263 9.93e-08 ***
C:D           0.5625     0.3094    1.818   0.0753 .  
A:B:C        -0.1875     0.3094   -0.606   0.5473    
A:B:D         1.4375     0.3094    4.647 2.65e-05 ***
A:C:D        -0.0625     0.3094   -0.202   0.8407    
B:C:D        -0.3125     0.3094   -1.010   0.3175    
A:B:C:D       0.0625     0.3094    0.202   0.8407    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.475 on 48 degrees of freedom
Multiple R-squared:  0.9933,    Adjusted R-squared:  0.9912 
F-statistic: 476.8 on 15 and 48 DF,  p-value: < 2.2e-16

  1. The factors are drill load, flow rate, rotational speed, and drilling mud.↩︎