A first look at activations in a deep learning model.
In the last lecture, we discussed the conceptual foundations of feature learning. In this lecture, we’ll see how to extract and visualize features learned by a computer vision model.
We will inspect a model that was trained1 to distinguish between photos of cats and dogs. We’ve included a subsample of the training dataset below – the full dataset can be downloaded here. From the printout, you can see that we have saved 20 images, each of size \(150 \times 150\) pixels, and with three color channels (red, green, and blue).
f <- tempfile()
download.file("https://uwmadison.box.com/shared/static/o7t3nt77iv3twizyv7yuwqnca16f9nwi.rda", f)
images <- get(load(f))
dim(images) # 20 sample images[1]  20 150 150   3par allows us to plot many images
side by side (in this case, in a \(4 \times 5\) grid).par(mfrow = c(4, 5), mai = rep(0.00, 4))
out <- images %>%
  array_tree(1) %>%
  map(~ plot(as.raster(., max = 255))) 
Figure 1: A sample of 20 random images from the dog vs. cat training dataset.
The array_tree function above splits the 4D array into a collection of 3D
slices. Each of these 3D slices corresponds to one image — the three channels
correspond to red, green, and blue colors, respectively. The next map line
plots each of the resulting 3D arrays
Next, let’s consider what types of features the model has learned, in order to distinguish between cats and dogs. Our approach will be to compute activations on a few images and visualize them as 2D feature maps. These visualizations will help us see whether there are systematic patterns in what leads to an activation for a particular neuron.
To accomplish this, we will create an R object to retrieve all the intermediate feature activations associated with an input image. Every time we call this object on a new image, it will return the activations for features at all layers.
# download model
f <- tempfile()
download.file("https://uwmadison.box.com/shared/static/9wu6amgizhgnnefwrnyqzkf8glb6ktny.h5", f)
model <- load_model_hdf5(f)
layer_outputs <- map(model$layers, ~ .$output)
activation_model <- keras_model(inputs = model$input, outputs = layer_outputs)
features <- predict(activation_model, images)features corresponds to a different layer. Within a single
layer, the 3D array provides the activations of each feature across different
spatial windows. For example, for the first layer, there are 32 features with
activations spread across a 148 x 148 grid, each grid element with its own
spatial context.dim(features[[1]])[1]  20 148 148  32plot_feature <- function(feature) {
  rotate <- function(x) t(apply(x, 2, rev))
  image(rotate(feature), axes = FALSE, asp = 1, col = brewer.pal(4, "Blues"))
}
ix <- 3
par(mfrow = c(1, 2), mai = rep(0.00, 4))
plot(as.raster(images[ix,,, ], max = 255))
plot_feature(features[[1]][ix,,, 1]) 
Figure 2: An image and its activations for the first neuron in layer 1.
par(mfrow = c(6, 7), mai = rep(0.00, 4))
out <- features[[2]][ix,,,] %>%
  array_branch(margin = 3) %>%
  map(~ plot_feature(.)) 
Figure 3: Activations for a collection of neurons at layer 2, for the same image as given above.
par(mfrow = c(6, 7), mai = rep(0.00, 4))
out <- features[[6]][ix,,,1:40] %>%
  array_branch(margin = 3) %>%
  map(~ plot_feature(.)) 
Figure 4: Activations for a collection of neurons at layer 6.