Why are experiments run in the first place?
Readings 1.1, 1.2, 1.4, Rmarkdown
golf score = f(driver type, type of ball, ...)
In theory, we could manipulate these factors to see how they influence golf score. If we consider only two factors, each with two possible levels, the resulting design is called a \(2^2\) design.
Mathematically, a first-order (additive) model for the response is
\[ y = \beta_{0} + \beta_{1}x_{1} + \beta_{2}x_{2} + \epsilon. \]
This assumes each factor's effect is the same no matter the level of the other. A more flexible model adds an interaction term,
\[ y = \beta_{0} + \beta_{1}x_{1} + \beta_{2}x_{2} + \beta_{12}x_{1}x_{2} + \epsilon, \]
because now the slopes can change depending on the value of the other factor. Factoring out \(x_{1}\) makes this explicit:
\[ y = \beta_{0} + \left(\beta_{1} + \beta_{12}x_2\right)x_{1} + \beta_{2}x_{2} + \epsilon. \]
Can you write an expression showing how the slope for \(x_{2}\) depends on \(x_{1}\)?
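As a concrete illustration, we can simulate a \(2^2\) experiment and recover the coefficients by least squares. The true coefficient values and noise level below are hypothetical, chosen only so the fit is easy to check; this is a sketch, not data from a real experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Coded levels for a 2^2 design: each factor takes values -1 or +1,
# so the four runs are the corners of a square.
x1 = np.array([-1, -1, 1, 1])
x2 = np.array([-1, 1, -1, 1])

# Hypothetical true coefficients (for illustration only).
beta0, beta1, beta2, beta12 = 70.0, -2.0, 1.0, 0.5

# Replicate each corner several times and generate noisy responses.
reps = 25
X1 = np.tile(x1, reps)
X2 = np.tile(x2, reps)
y = (beta0 + beta1 * X1 + beta2 * X2 + beta12 * X1 * X2
     + rng.normal(scale=0.5, size=X1.size))

# Design matrix: intercept, two main effects, and the interaction.
X = np.column_stack([np.ones_like(X1), X1, X2, X1 * X2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 2))  # estimates of (beta0, beta1, beta2, beta12)
```

With the coded \(\pm 1\) levels, the columns of the design matrix are orthogonal, which is part of why factorial designs are attractive: each coefficient can be estimated independently of the others.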
Suppose we want to see how K different binary factors influence golf score. We can no longer visualize the effects as corners of a square, but we can still collect samples for each configuration of factors. This is called a \(2^K\) experiment.
A challenge is that for large \(K\), this means collecting lots of samples: a full \(2^K\) design requires \(2^K\) runs before any replication, so the cost grows exponentially in the number of factors.
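The full set of configurations is easy to enumerate, and doing so makes the exponential growth concrete. A minimal sketch (the function name `configurations` is just for this example):

```python
from itertools import product

# Every configuration of K binary factors, coded as -1/+1.
def configurations(K):
    return list(product([-1, 1], repeat=K))

# The K = 2 case recovers the four corners of the square.
print(configurations(2))
# -> [(-1, -1), (-1, 1), (1, -1), (1, 1)]

# Run counts for a full 2^K design, before any replication.
for K in [2, 5, 10, 20]:
    print(K, len(configurations(K)))
```

Already at \(K = 20\), a single replicate needs over a million runs, which is why fractional factorial designs (running only a carefully chosen subset of configurations) become important.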
Sometimes we care more about optimization. In this case, we don't care so much about how each factor influences the outcome; we just want a combination of factors that maximizes it. We can visualize the outcome of the process as a function of several continuous variables.
Intuitively, our experimentation should proceed by first running a small preliminary experiment near our current settings and then moving in the direction of steepest ascent. This intuition is formalized in response surface methodology.
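The idea can be sketched in a few lines: fit a first-order model from a small \(2^2\) design around the current settings, then step in the direction of the estimated slopes. The response surface below is hypothetical (a quadratic with a known maximum, unknown to the "experimenter"), and the step size and noise level are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical response surface with its maximum at (2, -1).
def response(x, noise=0.1):
    x = np.asarray(x, dtype=float)
    return 10 - (x[0] - 2) ** 2 - (x[1] + 1) ** 2 + rng.normal(scale=noise)

def estimated_slopes(center, delta=0.5):
    """Run a small 2^2 design around `center` and fit a first-order
    model; the fitted slopes give the direction of steepest ascent."""
    corners = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
    X = np.column_stack([np.ones(4), corners])
    y = np.array([response(center + delta * c) for c in corners])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]  # slopes for the two factors

# Repeatedly step a fixed distance in the steepest-ascent direction.
x = np.array([0.0, 0.0])
for _ in range(20):
    g = estimated_slopes(x)
    x = x + 0.3 * g / (np.linalg.norm(g) + 1e-12)

print(np.round(x, 1))  # ends up near the optimum at (2, -1)
```

Real response surface methodology adds more structure, most notably switching to a second-order model and a richer design (such as a central composite design) once the slopes become small, to locate the optimum precisely.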