Sequential strategies for response surfaces.
Readings 11.1, 11.2, Rmarkdown
When we last discussed response surfaces (in Chapter 5), we showed how to fit a regression surface to a fixed set of runs. This let us find configurations of factors that optimized some response of interest.
However, the real power of response surfaces emerges when we think sequentially, using the results from one fit to plan a series of follow-up experiments, each bringing us closer to an optimal configuration of factors.
One approach is to divide experimentation into two phases. We can start with first-order models, and then proceed to second-order when in the vicinity of the optimum.
A first-order model is a linear model without any interactions or nonlinearities. It can be fit using relatively few samples; for example, unreplicated or fractional factorial designs. It is defined as, \[\begin{align*} y_{i} &= \beta_{0} + \sum_{k = 1}^{K}\beta_{k}x_{ik} + \epsilon_{i} \\ &= x_{i}^{T}b + \epsilon_{i} \end{align*}\]
where we defined \[\begin{align*} x_{i} = \begin{pmatrix} 1 \\ x_{i1} \\ \vdots \\ x_{iK} \end{pmatrix}, \end{align*}\] and \[\begin{align*} b = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{K} \end{pmatrix} \end{align*}\]
where we defined \(B\) as the symmetric \(K \times K\) matrix with diagonal entries \(\beta_{kk}\) and off-diagonal elements \(\frac{1}{2}\beta_{kl} = \frac{1}{2}\beta_{lk}\).
position = position_jitter(w = 0.5, h = 0.5)
. This prevents the few points sampled at the centerpoint from overlapping one another.library(ggplot2)
p <- ggplot(chem[[1]]) +
geom_point(
aes(time, temp, col = yield),
position = position_jitter(w = 0.5, h = 0.5),
size = 3
) +
coord_fixed() +
scale_color_viridis_c()
p
coded.data
function from the rsm
package to handle this.library(rsm)
chem_coded <- coded.data(chem[[1]], time_coded ~ (time - 35) / 5, temp_coded ~ (temp - 155) / 5)
FO
) model to these data.
Call:
rsm(formula = yield ~ FO(time_coded, temp_coded), data = chem_coded)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 40.444444 0.057288 705.9869 5.451e-16 ***
time_coded 0.775000 0.085932 9.0188 0.000104 ***
temp_coded 0.325000 0.085932 3.7821 0.009158 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.941, Adjusted R-squared: 0.9213
F-statistic: 47.82 on 2 and 6 DF, p-value: 0.0002057
Analysis of Variance Table
Response: yield
Df Sum Sq Mean Sq F value Pr(>F)
FO(time_coded, temp_coded) 2 2.82500 1.41250 47.8213 0.0002057
Residuals 6 0.17722 0.02954
Lack of fit 2 0.00522 0.00261 0.0607 0.9419341
Pure error 4 0.17200 0.04300
Direction of steepest ascent (at radius 1):
time_coded temp_coded
0.9221944 0.3867267
Corresponding increment in original units:
time temp
4.610972 1.933633
The output looks almost exactly the same as the output for lm
, except it also includes the direction of steepest ascent.
TWI
) and a quadratic component (PQ
). Neither of these terms seem to improve fit, so we’ll stick to the original first-order model. Compare the ANVOA with table 11.2.fit_iq <- rsm(yield ~ FO(time_coded, temp_coded) + TWI(time_coded , temp_coded) + PQ(time_coded, temp_coded), data = chem_coded)
summary(aov(fit_iq))
Df Sum Sq Mean Sq F value Pr(>F)
FO(time_coded, temp_coded) 2 2.8250 1.4125 32.849 0.00329 **
TWI(time_coded, temp_coded) 1 0.0025 0.0025 0.058 0.82132
PQ(time_coded, temp_coded) 1 0.0027 0.0027 0.063 0.81374
Residuals 4 0.1720 0.0430
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Path of steepest ascent from ridge analysis:
Path of steepest ascent from ridge analysis:
p +
geom_point(
data = ascent %>% select(-`|`),
aes(time, temp, col = yhat),
shape = 2
)
p +
geom_point(
data = ascent_long %>% select(-`|`),
aes(time, temp, col = yhat),
shape = 2
) +
geom_point(
aes(time, temp, col = yield),
position = position_jitter(w = 0.4, h = 0.4),
size = 3
) +
geom_point(
data = chem[[2]],
aes(time, temp, col = yield),
position = position_jitter(w = 0.4, h = 0.4),
size = 3
)
chem
contains the second experiment after following the path suggested by the method of steepest ascent.ggplot(chem[[2]]) +
geom_point(
aes(time, temp, col = yield),
position = position_jitter(w = 0.4, h = 0.4),
size = 3
) +
coord_fixed() +
scale_color_viridis_c()