When we are cooking something up, we may want to find an optimal mixture of ingredients. For example, when making a fiber for a kind of yarn, three ingredients (polyethylene, polystyrene, and polypropylene) are mixed in varying fractions. Which combination of fractions would maximize the stretchiness of the yarn?
We’ll explore this problem as a special case of the general response surface problem.
The mixture setting induces specific geometric constraints.
Suppose there are \(P\) total ingredients.
Let \(x_{p}\) denote the fraction of ingredient \(p\).
Since the \(x_{p}\) are mixture fractions, they must satisfy two constraints:
\(x_{p} \in \left[0, 1\right]\) for each \(p = 1, \dots, P\)
\(\sum_{p = 1}^{P} x_{p} = 1\)
The set of \(x = \left(x_{1}, \dots, x_{P}\right)\) that satisfy these constraints can be geometrically represented by a simplex.
The center point has an equal amount of each ingredient.
Each corner is a blend made up 100% of a single ingredient.
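The two constraints, and the center and corner points just described, can be checked numerically. This is a minimal base R sketch; the helper name `is_mixture` and the tolerance `tol` are assumptions introduced here for illustration.

```r
# Check the two mixture constraints, allowing a small numerical tolerance.
is_mixture <- function(x, tol = 1e-8) {
  all(x >= -tol) && all(x <= 1 + tol) && abs(sum(x) - 1) <= tol
}

P <- 3
center <- rep(1 / P, P)   # equal amount of each ingredient
corners <- diag(P)        # row p is 100% ingredient p

is_mixture(center)              # TRUE
apply(corners, 1, is_mixture)   # TRUE for every corner
is_mixture(c(0.5, 0.6, 0.1))    # FALSE: the fractions sum to 1.2
```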
There is nothing stopping us from fitting a response surface over the simplex.
What design points should we use?
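library(mixexp)
# Simplex lattice designs with 3 components, at two granularities
DesignPoints(SLD(3, 3))
DesignPoints(SLD(3, 5))
# Simplex centroid design with 3 components
DesignPoints(SCD(3))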
Simplex lattice design: Choose some \(m\) which will reflect the granularity of our design. Consider combinations of integers \(k_{p} \in \{0,1, \dots, m\}\) such that \(\sum_{p= 1}^{P} k_{p} = m\). Each such combination specifies a point \[\begin{align*} \frac{1}{m}\left(k_{1}, \dots, k_{P}\right) \end{align*}\] that is included in the simplex lattice design.
Simplex centroid design: Include the centroid of every nonempty subset of ingredients.
Corners: Add all \(P\) distinct permutations of the vector \(\left(1, 0, \dots, 0\right)\).
Edge midpoints: Add all \({P \choose 2}\) distinct permutations of \(\left(\frac{1}{2}, \frac{1}{2}, 0, \dots, 0\right)\). These are midpoints between two corners, and so lie on edges of the simplex.
Face centroids: Add all \({P \choose 3}\) distinct permutations of \(\left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}, 0, \dots, 0\right)\), which are the centers of faces defined by three corners.
Continue the pattern: For each \(k \leq P\), add all \({P \choose k}\) distinct permutations of \(\left(\frac{1}{k}, \dots, \frac{1}{k}, 0, \dots, 0\right)\), where \(\frac{1}{k}\) appears \(k\) times. The case \(k = P\) gives the overall centroid.
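Both constructions can be sketched in base R directly from their definitions. The functions `simplex_lattice` and `simplex_centroid` below are hypothetical helpers written for this illustration; they are not the `SLD` and `SCD` functions called earlier and are only meant to mirror the descriptions in the text. A lattice with \(P\) ingredients and granularity \(m\) has \({P + m - 1 \choose m}\) points, and a centroid design has \(2^{P} - 1\) points.

```r
# Simplex lattice: all points (k_1, ..., k_P) / m with integers k_p >= 0 summing to m.
simplex_lattice <- function(P, m) {
  grid <- expand.grid(rep(list(0:m), P))   # every integer vector in {0, ..., m}^P
  design <- grid[rowSums(grid) == m, , drop = FALSE] / m
  names(design) <- paste0("x", seq_len(P))
  rownames(design) <- NULL
  design
}

# Simplex centroid: for each nonempty subset of ingredients, the equal blend of that subset.
simplex_centroid <- function(P) {
  blends <- list()
  for (k in seq_len(P)) {
    subsets <- combn(P, k)                 # each column is one subset of size k
    for (j in seq_len(ncol(subsets))) {
      x <- numeric(P)
      x[subsets[, j]] <- 1 / k
      blends[[length(blends) + 1]] <- x
    }
  }
  design <- as.data.frame(do.call(rbind, blends))
  names(design) <- paste0("x", seq_len(P))
  design
}

lattice_32 <- simplex_lattice(3, 2)
centroid_3 <- simplex_centroid(3)
nrow(lattice_32)   # choose(3 + 2 - 1, 2) = 6 points
nrow(centroid_3)   # 2^3 - 1 = 7 points
```

Enumerating the full integer grid is wasteful for large \(P\) or \(m\), but it keeps the sketch close to the definition.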
There are some common variations:
The designs above are often augmented with center points.
Sometimes it is useful to include axial points, which are samples along rays extending from the corners of the simplex.
Computer-generated designs can be used, especially when there are constraints on feasible mixture values.
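Axial points can be sketched as blends of a corner with the overall centroid. In the minimal base R sketch below, the function name `axial_points` and the parameter `t` (how far to move from the corner toward the centroid) are assumptions made for illustration, not part of any package used in this post.

```r
# Row p of the result moves a fraction t of the way from corner p to the centroid.
axial_points <- function(P, t = 0.5) {
  corners <- diag(P)                                # row p is 100% ingredient p
  centroid <- matrix(1 / P, nrow = P, ncol = P)     # every row is the overall centroid
  (1 - t) * corners + t * centroid
}

axial_points(3)   # first row is (2/3, 1/6, 1/6)
```

With `t = 0.5`, each point sits halfway between a corner and the centroid; other values of `t` slide it along the same ray.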
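library(mixexp)
library(dplyr)
library(reshape2)  # melt() is assumed to come from reshape2
library(ggplot2)

# Reshape the 5-component simplex centroid design to long format: one row per (run, ingredient)
mscd <- SCD(5) %>%
  mutate(id = row_number()) %>%
  melt(id.vars = "id")

# Heatmap of the design: rows are runs, columns are ingredients, fill is the fraction
ggplot(mscd) +
  geom_tile(
    aes(x = variable, y = id, fill = value)
  ) +
  scale_fill_viridis_c() +
  coord_fixed() +
  theme(
    legend.position = "right",
    axis.text = element_text(size = 8),
    axis.title = element_blank()
  )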
We’ll use the yarn data from Example 11.5. The experiment used a (3, 2) simplex lattice design to measure variation in yarn elongation as a function of the fractions of different materials used to make the base fiber. First, let’s try to see the dependence visually, though direct visualization on the simplex is challenging.
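library(readr)
library(ggplot2)

yarn <- read_csv("https://uwmadison.box.com/shared/static/jghwbsnn6qjpwdr1lc97p9mbxk8qkwif.csv")

# Plot x1 against x2, encode x3 by point size and elongation by color.
# Jitter separates replicate runs taken at the same design point.
ggplot(yarn) +
  geom_point(
    aes(x = x1, y = x2, size = sqrt(x3), color = elongation),
    position = position_jitter(width = 0.1, height = 0.1)
  ) +
  scale_color_viridis_c() +
  theme(legend.position = "none")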
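Now, let’s fit a second-order polynomial to the data. Note that we include a -1 term in the fit below so that the model does not fit an intercept. Because the fractions sum to 1, an intercept would be perfectly collinear with the linear terms \(x_{1}\), \(x_{2}\), and \(x_{3}\).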
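The reduction behind this intercept-free form can be written out explicitly. Start from a full second-order polynomial and apply the mixture constraint; the coefficient symbols \(\beta\) below are notation introduced for this sketch and are not taken from the fit.

\[\begin{align*}
\mathbb{E}\left[y\right] &= \beta_{0} + \sum_{p = 1}^{P} \beta_{p} x_{p} + \sum_{p = 1}^{P} \beta_{pp} x_{p}^{2} + \sum_{p < q} \beta_{pq} x_{p} x_{q} \\
&= \sum_{p = 1}^{P} \left(\beta_{0} + \beta_{p} + \beta_{pp}\right) x_{p} + \sum_{p < q} \left(\beta_{pq} - \beta_{pp} - \beta_{qq}\right) x_{p} x_{q}
\end{align*}\]

The second line uses \(\beta_{0} = \beta_{0} \sum_{p = 1}^{P} x_{p}\) and \(x_{p}^{2} = x_{p}\left(1 - \sum_{q \neq p} x_{q}\right)\), both consequences of \(\sum_{p = 1}^{P} x_{p} = 1\). The intercept and the pure quadratic terms are absorbed into the linear and cross-product coefficients, which is why a formula of the form `-1 + (x1 + x2 + x3)^2`, expanding to the three linear terms and the three pairwise products, is enough.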
Call:
lm(formula = elongation ~ -1 + (x1 + x2 + x3)^2, data = yarn)

Residuals:
   Min     1Q Median     3Q    Max
 -0.80  -0.50  -0.30   0.65   1.30

Coefficients:
      Estimate Std. Error t value Pr(>|t|)
x1     11.7000     0.6037  19.381 1.20e-08 ***
x2      9.4000     0.6037  15.571 8.15e-08 ***
x3     16.4000     0.6037  27.166 6.01e-10 ***
x1:x2  19.0000     2.6082   7.285 4.64e-05 ***
x1:x3  11.4000     2.6082   4.371  0.00180 **
x2:x3  -9.6000     2.6082  -3.681  0.00507 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.8537 on 9 degrees of freedom
Multiple R-squared: 0.9977, Adjusted R-squared: 0.9962
F-statistic: 658.1 on 6 and 9 DF, p-value: 2.271e-11
We can plot the associated fit. Compare with Figure 11.43.
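Alongside a plot, the fitted surface can also be evaluated numerically. The sketch below copies the coefficient estimates from the printed summary and searches a grid over the simplex for the blend with the largest predicted elongation. The function `predict_elongation` and the grid resolution `steps` are hypothetical choices made here; with the fitted model object in hand, calling `predict()` on new data would be the more usual route.

```r
# Coefficient estimates copied by hand from the summary output above.
beta <- c(x1 = 11.7, x2 = 9.4, x3 = 16.4,
          "x1:x2" = 19.0, "x1:x3" = 11.4, "x2:x3" = -9.6)

# Evaluate the fitted second-order mixture polynomial at given fractions.
predict_elongation <- function(x1, x2, x3) {
  beta[["x1"]] * x1 + beta[["x2"]] * x2 + beta[["x3"]] * x3 +
    beta[["x1:x2"]] * x1 * x2 +
    beta[["x1:x3"]] * x1 * x3 +
    beta[["x2:x3"]] * x2 * x3
}

# Build a grid on the simplex from integer counts so that fractions sum to 1 exactly.
steps <- 100
grid <- expand.grid(i = 0:steps, j = 0:steps)
grid <- grid[grid$i + grid$j <= steps, ]
grid$x1 <- grid$i / steps
grid$x2 <- grid$j / steps
grid$x3 <- (steps - grid$i - grid$j) / steps
grid$elongation <- predict_elongation(grid$x1, grid$x2, grid$x3)

# Grid point with the largest predicted elongation.
best <- grid[which.max(grid$elongation), c("x1", "x2", "x3", "elongation")]
best
```

Taking the printed coefficients at face value, the grid maximum is a predicted elongation of about 17.4 at roughly \(x_{1} = 0.29\), \(x_{2} = 0\), \(x_{3} = 0.71\), which suggests a blend of the first and third ingredients. This is a point estimate from the fitted surface only and ignores the uncertainty in the coefficients.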