Foundations for Machine Learning

Notes for the morning session of day 2 of NAAMII’s Winter School on AI.

Learning Outcomes

Notebooks [1, 2, 3]

Given a problem description, determine features that may be relevant, including those directly present in the raw data and those that must be constructed.
Given a problem description, determine an appropriate response variable.
Design a model evaluation scheme for a specific problem / model context, keeping in mind the dangers of overfitting and the bias-variance trade-off.
Given a covariate / response pair, use visualization and summary statistics to identify preprocessing or transformation methods that may improve downstream model performance.
Discuss the relative merits of linear, sparse, and tree-based methods in a particular problem setting, and prepare code that implements them appropriately.
Use model summaries and data visualization to summarize the important features in a model and note areas for potential improvement.
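
As one illustration of the model evaluation outcome above, the following is a minimal sketch of k-fold cross-validation, which estimates error on held-out data rather than on the data used for fitting. It is not taken from the session notebooks. The helper names (`k_fold_indices`, `fit_line`, `cv_mse`), the synthetic data, and the choice of a one-covariate least-squares line are all assumptions made for illustration, and only the Python standard library is used.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single covariate."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error across k folds."""
    errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit only on the training folds, then score on the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        fold_errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        errors.append(statistics.fmean(fold_errors))
    return statistics.fmean(errors)


# Hypothetical synthetic data: a linear signal plus Gaussian noise.
rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 0.5) for x in xs]
print("5-fold CV mean squared error:", cv_mse(xs, ys, k=5))
```

The same loop can wrap any fitting procedure, so candidate models can be compared on held-out error rather than training error, which is where overfitting would otherwise go unnoticed.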

Exercises from this session are available here