Kris Sankaran

Microbiome-inspired Methodology

Dissertation

We made an effort to strengthen links between visualization and statistical modeling, with the high-level view that both fields offer approaches to data compression for human consumption. The dissertation was awarded the Jerome H. Friedman Applied Statistics Dissertation Award and was chosen as Stanford's nomination to the Council for Graduate Studies Dissertation Award (Math, Physical Sciences, and Engineering).

Regime Detection

We survey a variety of algorithmic and probabilistic approaches to the problem of detecting dynamic regimes in the microbiome, along with illustrative examples and code.

Description: arXiv preprint, poster
Code

Latent Variable Modeling

We applied probabilistic text modeling ideas to the microbiome, comparing methods and providing example workflows.

Description: peer-reviewed version, arXiv preprint, poster, slides
Code
Jane Austen and Microbes, an informal talk highlighting this work (press 'f' to advance slides)

centroidview

We developed an R package for simultaneous visualization of trees and heatmaps, based on linked brushing.

Code
Example

treelapse

An R package for interactive visualization of tree-structured time series, based on focus / context and linked brushing ideas.

Multitable Methods

We experimented with several perspectives for learning from multitable datasets, and offer some guidelines for their application to the microbiome context.

Article
Code

Perturbation Study

We studied the effects of colon cleanouts on the microbiome, applying novel methods to incorporate phylogenetic tree structure.

Microbiome Workflow

We described and provided code for a full workflow for microbiome data processing and analysis.

mvarVis

Using htmlwidgets, we developed an R package that generates interactive plots of standard multivariate analysis methods, via the FactoMineR, ade4, and vegan packages.

Poster
Code

structSSI

We created an R package implementing multiple testing procedures applicable to group and hierarchically structured data, and demonstrated their relevance in the microbiome setting.

Article
Code

Expository

Remembrances of States Past

This piece explains recurrent neural networks using interactive views of a one-dimensional example. It also highlights connections between sequential processing and statistical sufficiency, and probably has the silliest title of anything I've written (yet!).

Humanitarian AI

Community

We've been trying to create more opportunities for people to contribute to humanitarian AI projects. The term can be quite broad, which makes it important that work is coordinated across teams, to maintain coherence.

Workshops: I've been lucky to contribute to the organization of the AI for Social Good workshops at ICLR and ICML and the Computer Vision for Global Challenges workshop at CVPR. I'm on the program committees for Machine Learning for Development at NeurIPS and AI for Social Impact at AAAI.
Internships: Mila hosts internships specifically around humanitarian AI projects, and I've been fortunate to work with some really wonderful students through the program.
Reading group: We organized a reading group on this topic, to help students at Mila share the types of projects they were interested in.
Review materials: I created this summary and this spreadsheet as a reference for people who want to learn more about social applications of data science and machine learning.

Climate Change

From detecting leaks in methane pipes to accounting for uncertainty in future reservoir loads, machine learning has the potential to amplify a wide variety of (big and small) climate change mitigation and adaptation efforts.

Climate Change AI: This initiative includes researchers across continents and research domains, brought together by the vision of a low-carbon future supported by thoughtful machine learning research and applications. You should join the mailing list!
Review paper: This paper provides a structure for thinking about climate change and machine learning. The hope is that this taxonomy helps guide closer connections between researchers in mitigation and adaptation domains and methodological practitioners.
Applications: Aside from the CCAI initiative, our team at Mila has been investigating a few of the problems highlighted in the review, including predictive maintenance for wind farms, modeling extremes in time series, and communication of long-term climate change impacts.

Remote Sensing

Effective processing of aerial imagery is important for a variety of socially relevant applications, from conservation monitoring to crisis preparedness. We've worked directly on applications as well as pursued improved methodology.

Multiframe super-resolution: With a team from Element AI, we have experimented with approaches to align and fuse low-resolution imagery, inspired by the observation that, while high-resolution data are often expensive, low-resolution views are often plentiful.
Foundational mapping: With a team from Intel AI for Social Good, we are working towards mapping the locations of bridges, to provide better foundational data for disaster preparedness planning.
Interactivity: Mapping data are often very heterogeneous, so even the best models tend to require human validation before being used in the field. This study considers some approaches to interactively refining preliminary outputs, using weak supervision on prediction masks.
Interpretability: Remote sensing models hold the potential to transform a few types of socially relevant monitoring efforts, from poverty prediction to deforestation tracking. These studies explore the use of Concept Activation to make these models more accessible to the domain experts who use them. See these demos (1, 2) and reports (1, 2).

Other Academic Collaborations

Stanford NEMS

We designed algorithms to detect cytokine mixtures using data from novel nanoscale sensors.

Stanford HIV Database

We adapted local FDR methods to perform inference of APOBEC mutations.

Article
R package including local FDR methodology.

Industry Projects

In the past, I've worked on industry consulting projects and internships.

Climate Corporation

As an intern, I implemented a Bayesian approach to rainfall disaggregation, in order to incorporate a large amount of lower-resolution data into the company's forecasting pipeline.