Tracing the refinement of questions and design.
How does a visualization expert go about creating a data visualization? Perhaps the most useful lesson from this reading is that good visualizations don’t materialize out of thin air – there is always a creative process involved, steps where it’s unclear what the final result will be, even for a data visualization genius like Shirley Wu. We’re lucky that she has documented this process for us, so that we might be able to take away a few lessons for our own reflection.
Her visualization, “655 Frustrations of Data Visualization”, is based on an online data visualization survey. The survey had 45 questions (“How many years have you been doing data visualization?”, “What percent of your day is focused on data prep work?”, …) and received 981 responses, probably submitted mostly by the survey initiator’s internet following.
Figure 1: A few entries from the data visualization survey. The full data are publicly available here.
Figure 2: Example exploratory displays of the ‘percentage of time’ questions on the data visualization survey.
Figure 3: The initial design used a bar chart to see whether experience was related to interest in further work in data visualization.
Figure 4: The faceted barchart did not add much information relevant to the guiding question.
Figure 5: A redesigned plot. The left and right panels separate respondents with and without frustrations, vertical position encodes job role, and color gives number of years of experience.
A few more refinements were made. Instead of placing those with and without frustrations far apart on the page, they were rearranged to share the same \(x\)-axis. Also, instead of coloring circles by years of experience, color was used to represent the percentage of the day spent on data visualization. Again, these changes reflect sharpening of both design and questions.
In the final version of the static display, a boxplot was introduced to summarize the most salient characteristics of each beeswarm. Then, instead of just plotting the points in two parallel regions, they were made to “rise” and “fall” off the boxplots, depending on whether the respondents experienced frustrations. This kind of visual metaphor takes the visualization to another level; it becomes more than functional, it becomes evocative.
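To make the “rise” and “fall” idea concrete, here is a minimal sketch of how such a layout could be computed. It is not the designer’s code: the respondent records are invented, the function names `five_number_summary` and `rise_and_fall_layout` are hypothetical, and the choice of which group rises and which falls is an assumption, since the description above does not say.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical respondents: (years_of_experience, has_frustration).
# These values are invented for illustration; they are not the survey data.
respondents = [
    (1, True), (1, False), (2, True), (2, True), (3, False),
    (3, True), (5, False), (5, True), (5, True), (8, False),
]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the quantities a boxplot draws."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def rise_and_fall_layout(rows, step=1.0):
    """Stack points away from a shared baseline at y = 0.

    Points sharing an x value are stacked so they do not overlap.
    Assumption: frustrated respondents fall (negative y) and the rest
    rise (positive y); the source does not say which group goes where.
    """
    stack_height = defaultdict(int)  # (x, direction) -> points placed so far
    layout = []
    for x, frustrated in rows:
        direction = -1 if frustrated else 1
        stack_height[(x, direction)] += 1
        y = direction * step * stack_height[(x, direction)]
        layout.append((x, y, frustrated))
    return layout


summary = five_number_summary([years for years, _ in respondents])
layout = rise_and_fall_layout(respondents)
```

Here `summary` holds the values a boxplot would draw along the shared axis, and each entry of `layout` gives a point’s horizontal position, its signed offset from the baseline, and the frustration flag. A real beeswarm would also nudge points sideways to avoid collisions between neighboring values; this sketch only stacks exact ties.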
Figure 6: The final version of the static visualization.
Figure 7: The visualization above with interactivity added in.
Figure 8: A screenshot of notes from the designer’s exploration of the resulting visualization.
Wrapping up, the final visualization is clearly the culmination of substantial intellectual labor over the course of weeks (if not months). The result is both beautiful and informative. This is an ideal to strive for – the crafting of data visualizations that can guide discovery and change.
One final note. It’s often useful to study the development of projects that you find interesting. Sometimes, authors share their code on GitHub, or earlier versions are available through technical reports or recorded talks. This additional context can shed light on the overall inspiration and intention of the project, and especially when starting out, imitation can be an effective strategy for learning.