Characterizing optima in response surfaces.
Setting the gradient of the second-order model \(\hat{y}\left(x\right) = \hat{b}^{T}x + x^{T}\hat{B}x\) to zero, \(\hat{b} + 2\hat{B}x = 0\), we see that the function has a stationary point at \[\begin{align*} x_{\ast} &= -\frac{1}{2}\hat{B}^{-1}\hat{b}. \end{align*}\]
We’ll denote the value of the surface at this stationary point by \[\begin{align*} \hat{y}_{\ast} &= \hat{b}^{T}x_{\ast} + x_{\ast}^{T}\hat{B}x_{\ast} \\ &= -\frac{1}{2}\hat{b}^{T}\hat{B}^{-1}\hat{b} + \frac{1}{4}\hat{b}^{T} \hat{B}^{-1} \hat{B} \hat{B}^{-1} \hat{b} \\ &= -\frac{1}{4}\hat{b}^{T}\hat{B}^{-1}\hat{b}. \end{align*}\]
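These two formulas are easy to evaluate directly. As a quick numerical sketch, the chunk below uses made-up coefficients b_hat and B_hat (hypothetical, not taken from any fitted model).

# hypothetical second-order coefficients for two factors
b_hat <- c(1, -2)
B_hat <- matrix(c(-2.0, 0.5,
                   0.5, -1.0), nrow = 2)

# stationary point and the value of the surface there, using the formulas above
x_star <- -0.5 * solve(B_hat) %*% b_hat
y_star <- -0.25 * t(b_hat) %*% solve(B_hat) %*% b_hat
x_star
y_star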
Just like in ordinary calculus, there are three possibilities. To distinguish them, write the eigendecomposition \(\hat{B} = U \Lambda U^{T}\), where \(\Lambda = \text{diag}\left(\lambda_{1}, \dots, \lambda_{K}\right)\), and define \(w\left(x\right) = U^{T}\left(x - x_{\ast}\right)\). The fitted surface can then be written in the canonical form \[\begin{align*} \hat{y}\left(x\right) &= \hat{y}_{\ast} + w^{T}\left(x\right) \Lambda w\left(x\right). \end{align*}\]
Proof1: The textbook skips the proof of this fact, but if you have seen some linear algebra, the approach is interesting. First, plug in the value of \(\hat{y}_{\ast}\) from above and expand the definition of \(w\left(x\right)\), \[\begin{align*} \hat{y}\left(x\right) &= \hat{y}_{\ast} + w^{T}\left(x\right) \Lambda w\left(x\right) \\ &= -\frac{1}{4}\hat{b}^{T}\hat{B}^{-1}\hat{b} + \left(x - x_{\ast}\right)^{T}U\Lambda U^{T} \left(x - x_{\ast}\right). \end{align*}\]
Now, use the eigendecomposition \(\hat{B} = U \Lambda U^{T}\) introduced above and expand the quadratic, \[\begin{align*} \hat{y}\left(x\right) &= -\frac{1}{4}\hat{b}^{T}\hat{B}^{-1}\hat{b} + \left(x - x_{\ast}\right)^{T}\hat{B}\left(x - x_{\ast}\right) \\ &= -\frac{1}{4}\hat{b}^{T}\hat{B}^{-1}\hat{b} + x^{T}\hat{B}x - 2x^{T}\hat{B}x_{\ast} + x^{T}_{\ast}\hat{B}x_{\ast}. \end{align*}\]
To simplify, let’s plug in the expression for the stationary point \(x_{\ast} = -\frac{1}{2}\hat{B}^{-1}\hat{b}\) that we found above, \[\begin{align*} \hat{y}\left(x\right) &= -\frac{1}{4}\hat{b}^{T}\hat{B}^{-1}\hat{b} + x^{T}\hat{B}x + x^{T}\hat{B}\hat{B}^{-1}\hat{b} + \frac{1}{4}\hat{b}^{T}\hat{B}^{-1}\hat{B}\hat{B}^{-1}\hat{b} \\ &= \hat{b}^{T}x + x^{T}\hat{B}x, \end{align*}\]
which was exactly our original definition of the second-order model.
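This identity can also be checked numerically. The sketch below continues the hypothetical example from before (it reuses b_hat, B_hat, x_star, and y_star) and compares both forms of the surface at an arbitrary point.

# evaluate the original model and the canonical form at an arbitrary point;
# the two values should agree
x <- c(0.3, -0.7)
model_form <- sum(b_hat * x) + t(x) %*% B_hat %*% x
canonical_form <- y_star + t(x - x_star) %*% B_hat %*% (x - x_star)
c(model_form, canonical_form)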
When all of the \(\lambda_{k}\) are negative, moving any of the \(w_{k}\left(x\right)\) away from zero can only decrease \(\hat{y}\left(x\right)\), so we’re at a maximum. Similarly, when all of the \(\lambda_{k}\) are positive, moving the \(w_{k}\left(x\right)\)’s by any amount can only increase \(\hat{y}\left(x\right)\). This means we’re at a minimum!
Finally, when some of the \(\lambda_{k}\)’s are positive and others are negative, we’re at a saddle point.
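In code, the classification only needs the signs of the eigenvalues of \(\hat{B}\). Here is a small sketch, again reusing the hypothetical B_hat from above.

# classify the stationary point by the signs of the eigenvalues of B_hat
lambda <- eigen(B_hat, symmetric = TRUE)$values
stationary_type <- if (all(lambda < 0)) {
  "maximum"
} else if (all(lambda > 0)) {
  "minimum"
} else {
  "saddle point"
}
stationary_type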
To make these ideas concrete, we work with the chem dataset, which records the yield of a chemical process at a range of time and temperature settings.

library(readr)
library(dplyr)
library(ggplot2)

# read the chemical process data and plot yield against time and temperature
chem <- read_csv("https://uwmadison.box.com/shared/static/nbaj1m8j7tuaqmznjlrsgbzyhp9k61i8.csv")
ggplot(chem) +
  geom_point(
    aes(time, temp, col = yield),
    position = position_jitter(w = 0.3, h = 0.3)
  ) +
  coord_fixed() +
  scale_color_viridis_c()
We code the two factors and use SO to fit a second-order model. From the ANOVA table below, it appears that the linear and pure quadratic components are very strong, and that there is limited, if any, interaction between these two variables.

library(rsm)
chem_coded <- coded.data(chem, time_coded ~ (time - 35) / 5, temp_coded ~ (temp - 155) / 5)
fit <- rsm(yield ~ SO(temp_coded, time_coded), data = chem_coded)
summary(aov(fit))
                            Df Sum Sq Mean Sq F value   Pr(>F)    
FO(temp_coded, time_coded)   2 10.043   5.021  70.814 2.27e-05 ***
TWI(temp_coded, time_coded)  1  0.250   0.250   3.526    0.103    
PQ(temp_coded, time_coded)   2 17.954   8.977 126.594 3.19e-06 ***
Residuals                    7  0.496   0.071                     
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
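For a single consolidated report, rsm's summary method can be called directly on the fit; it prints the coefficient table, an ANOVA, and the canonical analysis together (output not shown here).

# full rsm summary, including the canonical analysis (output omitted)
summary(fit)

Below, canonical extracts just the two pieces we need: the coded stationary point and the eigendecomposition of \(\hat{B}\).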
analysis <- canonical(fit)
analysis
$xs
temp_coded time_coded 
  4.305847  10.389230 

$eigen
eigen() decomposition
$values
[1] -0.9634986 -1.4142867

$vectors
                 [,1]       [,2]
temp_coded -0.9571122 -0.2897174
time_coded -0.2897174  0.9571122
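Both eigenvalues are negative, so the stationary point is a maximum of the fitted surface. To report it on the original measurement scales, we can convert the coded coordinates back through the codings. The chunk below is one plausible reconstruction of this step, using rsm's code2val; the code that produced the output that follows is not part of this excerpt.

# map the coded stationary point back to natural time and temperature units
code2val(analysis$xs, codings(chem_coded))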
     temp      time 
176.52923  86.94615 
We can visualize the fitted surface, along with its canonical axes, using the contour method provided by the rsm package together with segments to draw the axes in red.

# `stationary`, `w1`, and `w2` are not defined in this excerpt; the lines below are
# an assumed reconstruction, placing the stationary point and the endpoints of the
# two canonical (eigenvector) axes in the original units used on the plot axes
stationary <- code2val(analysis$xs, codings(chem_coded))
w1 <- code2val(analysis$xs + 2 * analysis$eigen$vectors[, 1], codings(chem_coded))
w2 <- code2val(analysis$xs + 2 * analysis$eigen$vectors[, 2], codings(chem_coded))

contour(fit, ~ time_coded + temp_coded, image = TRUE, asp = 1)
segments(stationary[2], stationary[1], w1[2], w1[1], col = "red")
segments(stationary[2], stationary[1], w2[2], w2[1], col = "red")
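For a three-dimensional view of the same fit, the rsm package also supplies a persp method; the arguments below are illustrative choices rather than settings from the original analysis.

# perspective plot of the fitted yield surface over the two coded factors
persp(fit, ~ time_coded + temp_coded, zlab = "yield", theta = 30)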
You can skip this proof without worrying about whether it will appear in later lectures or assignments / exams. But I encourage you to work through it; it gives a flavor of more advanced topics in statistics.↩︎