Train a model for classifying pixels into different glacier types, using the PyTorch package.
import urllib.request
import tarfile
from pathlib import Path
from data import create_dir, download_data
import os
# setup directory structure for download
data_dir = Path("/home/jovyan/data")
process_dir = data_dir / "processed"
create_dir(process_dir)

# download processed data
download_data("https://uwmadison.box.com/shared/static/d54agxzb5g8ivr7hkac8nygqd6nrgrqr.gz",
              process_dir / "train.tar.gz")
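The create_dir and download_data helpers come from the workshop's data module. For readers working outside that environment, hypothetical stand-ins built from the urllib.request and tarfile imports above might look like the sketch below; the real helpers may behave differently.

# hypothetical stand-ins for the workshop's data helpers, built on the
# standard library; the actual data module may implement these differently
def create_dir(path):
    # create the directory (and any missing parents) if it doesn't exist yet
    Path(path).mkdir(parents=True, exist_ok=True)

def download_data(url, out_path):
    # fetch the archive, then unpack it alongside the downloaded file
    urllib.request.urlretrieve(url, out_path)
    with tarfile.open(out_path) as archive:
        archive.extractall(Path(out_path).parent)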
The lr parameter below refers to the optimizer’s learning rate. The binder notebooks we’re running off of don’t have GPUs. If they did, we could set device: "cuda", and we’d be able to train the model much faster. We also had to limit the batch size, to avoid going over the memory limit imposed on these online notebooks.

args = {
    "batch_size": 1,  # make this bigger if you are not running on binder
    "epochs": 50,
    "lr": 0.0001,
    "device": "cpu"  # set to "cuda" if GPU is available
}
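As an aside (this snippet is not part of the original notebook), rather than hand-editing the device field, you can let PyTorch detect whether a GPU is available:

import torch

# pick the GPU automatically when one is available, otherwise stay on CPU
args["device"] = "cuda" if torch.cuda.is_available() else "cpu"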
We pass the training data to the model through the DataLoader object below. If you tried visualizing the items in the data loader, you would see the same image-label pairs from the previous notebook. This step might seem mysterious if you haven’t used a deep learning algorithm before. I’m deliberately avoiding an extended discussion on deep learning – my emphasis here is on visualization and earth observation. There are also many good references on applied deep learning already.

from data import GlacierDataset
from torch.utils.data import DataLoader

paths = {
    "x": list((process_dir / "train").glob("x*")),
    "y": list((process_dir / "train").glob("y*"))
}

ds = GlacierDataset(paths["x"], paths["y"])
loader = DataLoader(ds, batch_size=args["batch_size"], shuffle=True)
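As a quick sanity check (my addition, assuming the dataset yields (image, label) pairs, as the training loop expects), you can pull a single batch from the loader and inspect its shape; with 13 sensor channels, the input should be (batch_size, 13, height, width):

# peek at one batch to confirm the shapes coming out of the loader
x, y = next(iter(loader))
print(x.shape)  # expected: (batch_size, 13, height, width)
print(y.shape)  # the corresponding label masks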
Given a way of loading the training data, we can train our model. The parameters in the definition of the Unet correspond to 13 input sensor channels, 3 output classes (clean-ice glacier, debris-covered glacier, and background), and 4 layers. We can try running the training loop below, but it will not finish in the time available for this workshop (though, with a larger batch size and a GPU, it doesn’t take too long to converge). We’ll instead download a model that I trained earlier. That model was also trained using all the training patches, not just those from the Kokcha basin.
import torch.optim
from unet import Unet
from train import train_epoch
model = Unet(13, 3, 4, dropout=0.2).to(args["device"])
optimizer = torch.optim.Adam(model.parameters(), lr=args["lr"])
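Before committing to a long training run, it can be worth confirming the input and output shapes with a dummy forward pass. This check is my addition; the 512 x 512 patch size is only an illustration, not something fixed by the notebook:

# sanity check: push one random 13-channel patch through the network
x_dummy = torch.randn(1, 13, 512, 512).to(args["device"])
# for a U-Net with padded convolutions, expect torch.Size([1, 3, 512, 512])
print(model(x_dummy).shape)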
for epoch in range(args["epochs"]):
"device"], epoch)
train_epoch(model, loader, optimizer, args[
/ "model.pt") torch.save(model.state_dict(), data_dir