Working for the marrow: a review of Info We Trust

Adapted from correspondence with the author.

Grad school has made me into a mercenary reader. My habit is to tear the meat off of a book, throw the bones back, and move on to the next assignment. This is not a satisfying way to engage with RJ Andrews’s book, Info We Trust. The book isn’t meat; it’s marrow. It’s rich, it’s rewarding, and it requires a lot more work from me to be nourishing.

Many data visualization books read like textbooks. Funny, personable textbooks, but textbooks all the same. Info We Trust is more of a meditation. Gentle explanations of chart types meander through a speculative history of the human perception of up and down. Best practices for table design emerge from a section about the history of bureaucracy. There is no neat delineation between background, body, and exercises: it’s all of a piece. I suspect that is the point.

Info We Trust tells a grand story about civilization. The first three chapters are a human history of information, connecting data work today to the world before electronic record-keeping. I’m not immune to poetry, and I have a long-standing (if often neglected) love affair with history. In my experience, history is spiky with context, competing interests, and strange accidents. The narrative presented by Info We Trust is so smooth and straightforward that I find it suspect. As a heroic epic, it works. As a history, I’m not quite willing to take it on faith.

However, that’s also part of what I enjoyed about the book. In the chapter on storytelling, Andrews wrote, “Great stories are rich with opportunities for the listener to make connections on their own. These self-made connections help the story leap off the page and into the reader’s imaginative reality. The more the story becomes alive in the reader’s head, the more meaningful the story becomes.” If it isn’t obvious that I struggled with this book: I struggled with this book! But that struggle brought it to life. I came face-to-face with what I thought I knew, where I was willing to listen, and my own biases.

The second half of the book is rich with opportunities for positive connections, particularly in the chapters on museum design, storytelling, engineering, and advertising. Andrews opens doors to unexpected worlds, allows me to make my own connections, and lets me find value in my own way. In the cathedral case study, I got to see him draw those connections, too. The “we” in the title is not just a figure of speech. I felt like I was sitting in conversation with Andrews throughout the book. The extensive marginalia presented alternate-universe versions of that conversation, where we split away from the main narrative to wander down a different rabbit hole.

Info We Trust is a generous and deeply human reflection on data. There is plenty of concrete advice about visualization, but it is woven into the narrative, not plucked out, polished, and ready for use. Nor should it be. The field has plenty of technical manuals. It doesn’t have anything quite like this.

My habit as a reader is to ask, what is this book trying to do? What is it going to teach me? Andrews flips those questions back around: what am I going to do with the book? How am I going to learn from it? Info We Trust is not a list of best practices, an in-depth history, or an immediate return on investment. However, it is a refresh on the craft, a feast for the eyes, and an opportunity to think deeply by drawing connections. I’m grateful for the chance to wrestle with this text, and I expect I’ll return to the mat soon.

Framing questions and crochet hooks

Interlibrary loan has reclaimed my copy of Visualization Analysis and Design, so I’m on to the next book on my shelf: Information Visualization: Perception for Design by Colin Ware.

I stand behind Ware’s position that data visualization is a tool for cognitive work, an external aid that shores up memory and pattern perception. Our brains need tools to think through complicated information, the same way our hands need tools to weave cloth. I can see the numbers in a spreadsheet, but interpreting them is like trying to turn a pile of yarn into fabric with nothing but my fingers. A simple tool like a crochet hook radically extends what I can do with raw materials.

I do, however, struggle with the profit model introduced in the first chapter. Ware writes that learning to interpret new graphic symbols comes with a cost, and that novel designs should be used only when their benefits outweigh the cost of learning to use them.

Continue reading “Framing questions and crochet hooks”

Blown out of the blanket fort: beginning Tamara Munzner’s Visualization Analysis and Design

Confirmation bias is pernicious, but so is confirmation pleasure: the comfortable settling-in while reading one’s third or fourth introductory text on a subject, instead of reaching for something a little more challenging. It’s like reading a retelling of a fairy tale, or watching the hundredth episode of a procedural. When I open an intro-level book on data visualization, I know we’re going to talk about chartjunk, and axes that stretch to zero, and the concept of statistical uncertainty. I can snuggle into these subjects like a blanket.

The first chapters of Tamara Munzner’s Visualization Analysis and Design blew that smugness right out of me. Reading this book isn’t like wrapping myself in a blanket. It’s like climbing a rope ladder in a windstorm. I know how I got here, I can see where I’m going, and if I stretch, I can just reach the next rung. But I’ve been blown far from my comfort zone, and suddenly I can see that the horizon stretches way further than I expected.

All I can say is: good, and, more of this please.

Munzner’s book isn’t a how-to guide. It’s a framework for thinking about information visualization: all information visualization. I’m accustomed to thinking of vis as a means of communication. Munzner identifies communication as one of many possible goals, and develops a vocabulary that reaches across disciplines. When is a field not a field? When we have to consider continuous spatial data, that’s when.

While I’m busy rebuilding my fundamental understanding of information visualization, here are a few points that stuck out to me:

  • Visualization is an extension of human memory and information processing, but it’s also an extension of computer capacity. Graphics translate back and forth between a computer’s raw processing power and a human’s ability to pick out patterns that matter to other humans. And, like any translator, the grammar that a visualization uses when presenting information changes the way that information is understood. (It is a little bit refreshing to think of humans as an asset instead of a ball-and-chain for algorithms!)
  • The goal of visualization is not to optimize but to satisfy. When facing a design problem, don’t fixate on a narrow range of options and obsess over finding the very best one; keep your eyes open to a wide variety of options and choose one of the many good ones. This runs counter to all of my instincts: I’m accustomed to thinking that if I spend just a few more minutes tinkering with formatting and color and font, I can beam my intentions directly into the brain of my audience. That’s a great way to fall down a rabbit hole of minutiae, and a poor way to keep my eyes open to the full scope of methods available to me.

The Why Axis: Searching beneath the streetlight

A drunk is carefully searching the ground beneath a streetlamp at night. A passerby asks what they’re looking for, and the drunk says they’re looking for their keys. The passerby asks whether the drunk lost the keys there, and the drunk says, “No, but this is where the light is.”

The streetlight effect (or streetlight problem, or drunkard’s search) is a common trap when working with data. What we really want to know hasn’t been measured, or is very difficult to measure, or can’t be measured at all, so we substitute an easily accessible dataset to try to achieve the same goals. Is our answer in that dataset? No, but that’s what’s on GitHub.

I’m wrapping up my read of Picturing the Uncertain World with an exploration of the streetlight effect. The examples I’m pulling from the book are based on semi-serious asides, not genuine policy proposals, but I think they illustrate the streetlight effect and some associated dangers.

The passerby in the story has one big question for the drunk: are they looking in the right place for their keys? However, I think there are three questions here for a data analyst to consider: are we looking in the right place for the keys, are the keys what we should be looking for, and what happens when we find them?

Continue reading “The Why Axis: Searching beneath the streetlight”

An Accounting

Picturing the Uncertain World closes with visualizations created by the Jewish residents of the Kovno Ghetto as part of a community effort to record the Holocaust as it happened. There are bar and line and isotype charts of population changes, tables that summarize overcrowding, diagrams that show common illnesses and injuries in the ghetto. One line chart caught and held me: a display showing the population of the ghetto in September 1941 and again in November of the same year.

[Chart: population of the Kovno Ghetto by age group, September and November 1941]

Scanned from Picturing the Uncertain World. More images are available from the United States Holocaust Memorial Museum, which also hosts a full electronic exhibit on the Kovno Ghetto.

The red upper line represents the population in September 1941, split into age groups across the bottom axis. The black lower line represents the population in November of the same year. The shaded region in between represents the people who died that autumn.

In the face of deprivation and death, the residents of the Kovno Ghetto did what all humans do: record. When moments are too big for our capacity to feel and understand, we spill them out into diaries, letters, conversations, and art. The Statistics Office in the Kovno Ghetto recorded their community in its entirety: how many people once lived there, and how many still survived.

Continue reading “An Accounting”

Visualizing Error

In Chapter 13 of Picturing the Uncertain World, Howard Wainer observes that the less precise an estimate is, the bigger its error bars, and the more visual prominence it has in a visualization. The more precise an estimate is, the less ink it gets.

He’s talking about images like this (chart from Picturing the Uncertain World, Chapter 13, annotation mine):

[Chart: estimates plotted with error bars, annotated to mark the widest interval]

In all my years of looking at graphs of confidence intervals, I had never put my finger on that point. Read on for a quick description of confidence intervals/standard error, and a suggested solution to the issue.
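
To see the pattern on the page before reading on, here is a minimal sketch in Python of the kind of chart he means. It is my own illustration, not a figure from the book, and the group labels, estimates, and standard errors are all invented. The only real ingredient is the conventional 95% interval of roughly 1.96 standard errors on either side of the estimate, which is exactly what hands the least precise estimate the longest, most prominent bar.

    # A conventional error-bar chart: the least certain estimates get the most ink.
    # All numbers below are made up purely for illustration.
    import matplotlib.pyplot as plt

    labels = ["A", "B", "C", "D"]          # hypothetical groups
    estimates = [10.0, 12.5, 9.0, 11.0]    # hypothetical point estimates
    std_errors = [0.4, 1.8, 0.9, 3.0]      # hypothetical standard errors

    # 95% confidence interval half-width is about 1.96 * standard error,
    # so a larger standard error means a longer, more prominent bar.
    half_widths = [1.96 * se for se in std_errors]

    fig, ax = plt.subplots()
    ax.errorbar(range(len(labels)), estimates, yerr=half_widths, fmt="o", capsize=4)
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_ylabel("estimate with 95% confidence interval")
    plt.show()

Run it and the widest bars belong to groups B and D, the two with the largest standard errors: the least trustworthy numbers get the most visual weight.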

Continue reading “Visualizing Error”

The Why Axis: small sample sizes and too many slopes

I’m working my way through Picturing the Uncertain World by Howard Wainer, a collection of articles about dealing with uncertainty in statistical thinking and visualization. I could (and might!) write an essay about every article in the book. In this first post, I want to pull out two points that might be useful when analyzing or writing about data.

Picturing the Uncertain World is a series of real-world case studies, often with deeply felt consequences. I’m collapsing a few articles together here, so to illustrate these points I’m going to use a fictional example with absolutely no consequences: swords in the made-up country of Knightlandia. Look for the bolded text if you’re only interested in the bottom line.

Continue reading “The Why Axis: small sample sizes and too many slopes”