Final Takeaways

Some major themes from STAT 479, in a nutshell.

Kris Sankaran (UW Madison)
2021-04-26


  1. We’ve covered many practical visualization strategies in this course. However, I hope a few overarching themes have come across as well. In these notes, I will try to articulate a few of them; they are also the subject of the readings reviewed in the next few notes.

  2. First, creating a visualization is in many ways like writing an essay. A great visualization cannot simply be summoned on demand. Instead, visualizations go through drafts, as the ideal graphical forms and questions of interest become clearer.

  3. Visualization is not simply for consumption by external stakeholders. In data science, visualization can also support the process of exploration and discovery, before any takeaways have been found.

  4. Visualization can be used throughout the data science process, from data quality assessment to model prediction analysis. In fact, I would be wary of any data science workflow that did not make use of visualization at multiple intermediate steps.

  5. It pays dividends to think carefully about data format. A design that is easy to implement when the data are in tidy format can be difficult when they are in wide format, and vice versa.

  6. Structured, nontabular data – think time series, spatial formats, networks, text, and images – are everywhere, and specific visualization idioms are available for each.

  7. Similarly, high-dimensional data are commonplace, but a small catalog of techniques, like faceting, dynamic linking, PCA, and UMAP, can take us quite far.

  8. Finally, data visualization can be enriching. Visual thinking has helped me navigate and appreciate complexity in many contexts. Through this course, I have shared techniques that I’ve found useful over the years, and I hope you find them handy as you go off and solve real-world data science problems.
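
To make the point about data format (item 5) a little more concrete, here is a minimal sketch of moving between wide and tidy layouts. It uses plain Python rather than any particular library, and the temperature table, the column names, and the helpers `to_tidy` and `to_wide` are all hypothetical, made up for illustration. In R, tidyr's `pivot_longer()` and `pivot_wider()` play roughly these roles.

```python
from collections import defaultdict

# Hypothetical wide table: one row per city, one column per month.
wide = [
    {"city": "Madison", "jan": -7, "feb": -5, "mar": 1},
    {"city": "Austin", "jan": 11, "feb": 13, "mar": 17},
]


def to_tidy(rows, id_col, names_to, values_to):
    """Reshape wide rows into tidy rows: one observation per row."""
    tidy = []
    for row in rows:
        for col, value in row.items():
            if col != id_col:
                tidy.append({id_col: row[id_col], names_to: col, values_to: value})
    return tidy


def to_wide(rows, id_col, names_from, values_from):
    """Reshape tidy rows back into one row per id, one column per name."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row[id_col]][row[names_from]] = row[values_from]
    return [{id_col: key, **cols} for key, cols in grouped.items()]


tidy = to_tidy(wide, "city", "month", "temp")

# Tidy format suits designs that map a variable to a visual channel,
# e.g. one line per city with month on the x-axis.
for row in tidy:
    print(row)

# Wide format suits designs that compare columns directly,
# e.g. a scatterplot of January against March temperatures.
assert to_wide(tidy, "city", "month", "temp") == wide
```

The round trip at the end is only a sanity check that the two layouts carry the same information. Which one is convenient depends on the design: the tidy rows can be passed directly to an encoding like "x = month, y = temp, color = city", while the wide rows make "x = jan, y = mar" a one-line specification.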