The Lamppost: On Deciding To Give A Damn

I’m too new at data visualization to create a best-of-2018 list, and my only work resolution is to keep going. Hi! I’m here! I like graphs! Instead, I’m starting the year by reflecting on why I do this work. Why on earth do graphs matter so much to me?

I recently came across this quote from Brenda Ueland’s If You Want to Write: A Book About Art, Independence, and Spirit:

“When Van Gogh was a young man in his early twenties, he was in London studying to be a clergyman. He had no thought of being an artist at all. He sat in his cheap little room writing a letter to his younger brother in Holland, whom he loved very much. He looked out his window at a watery twilight, a thin lamppost, a star, and he said something in his letter like this: ‘It is so beautiful I must show you how it looks.’ And then on his cheap ruled note paper, he made the most beautiful, tender, little drawing of it. When I read this letter of Van Gogh’s it comforted me very much and seemed to throw a clear light on the whole road of Art. Before, I thought that to produce a work of painting or literature, you scowled and thought long and ponderously and weighed everything solemnly and learned everything that all artists had done aforetime… but the moment I read Van Gogh’s letter I knew what art was, and the creative impulse. It is a feeling of love and enthusiasm for something, and in a direct, simple, passionate, and true way, you try to show this beauty in things to others, by drawing it. And Van Gogh’s little drawing on the cheap note paper was a work of art because he loved the sky and the frail lamppost against it so seriously that he made the drawing with the most exquisite conscientiousness and care.”

There are a lot of ways to make art, and a lot of ways to visualize data, and a lot of reasons to do both. But on my best days, my work comes from the same place as Van Gogh’s lamppost. I love my subject, and I want to share it. When I can’t love my data, I choose to respect it. Sometimes I respect the effort that went into gathering the data; sometimes I respect the subject that the data represents; sometimes I respect the truth in the abstract and extend that respect to the subject at hand. Conscientiousness and care spring directly from that love and respect. How else can I treat something that matters to me?

Van Gogh loved the lamppost, but he also loved his brother. He drew the lamppost because he wanted to share it. The audience of a visualization matters to me, too. Sometimes I have a personal connection: I know who will use a visualization, and I want them to understand, because that understanding will answer a question or entertain them or help them make a decision. Sometimes the audience matters in an impersonal way. I’m asking for their time and attention, and the least I can do is remember that I’m working for them, not for me.

Ueland waves off long and ponderous thought, learning, and scowling. But these are also part of the process (especially the scowling). The best visualizations respect the reality of their source material and the capacity of their audience, and extend one to meet the other. Knowledge, skill, and experience make that extension possible.

There are times when respecting my data feels like hauling an anvil up a mountain. I can guarantee that I won’t always love my audience. (Who among us hasn’t made a graph out of spite?) Sometimes I play with form for its own sake, without an eye to audience. Sometimes I want to make something beautiful, even if it doesn’t say very much. But I try to come back to subject and audience, and the simple desire to share.

Prioritizing subject and audience helps to avoid the pitfalls of data visualization. Misrepresenting the data prioritizes an agenda ahead of my subject. Concealing the limitations of my data and the uncertainty in my conclusions prioritizes my ego. Carelessness with the data’s source and context prioritizes my ease. Splashy, opaque graphics use the audience for views and clicks, rather than prioritizing what they could learn.

Love and respect make care and conscientiousness natural. Treating data with respect makes for better visualizations, because it gives us a reason to get it right beyond the fear of getting it wrong. I do this work because the world is important, and I want to show it to you. I do this work because you are important, and I want to show you the world.


Mercy on our minds: lightening cognitive load with the known-new contract

Lawrence Evalyn and I have an interdisciplinary friendship. I study research methods and data visualization; he studies eighteenth-century literature and the digital humanities. I taught him about pivot tables; he taught me about sentence stress. I still think I got the better half of that exchange.

Sentence stress is my favorite tool for writing about complicated topics. Communicating complexity is also my goal in data visualization, so sentence stress is a natural complement to a conversation about data storytelling.

According to the concept of sentence stress, every sentence has two parts: the topic position and the stress position. A sentence’s stress position establishes its main idea. It always comes just before a full stop. For example, “When the pirates came over, we played board games” emphasizes the board games. “At our board game night, we played with pirates” emphasizes the pirates.


Framing questions and crochet hooks

Interlibrary loan has reclaimed my copy of Visualization Analysis and Design, so I’m on to the next book on my shelf: Information Visualization: Perception for Design by Colin Ware.

I stand behind Ware’s position that data visualization is a tool for cognitive work, an external aid that shores up memory and pattern perception. Our brains need tools to think through complicated information, the same way our hands need tools to weave cloth. I can see the numbers in a spreadsheet, but interpreting them is like trying to turn a pile of yarn into fabric with nothing but my fingers. A simple tool like a crochet hook radically extends what I can do with raw materials.

I do, however, struggle with the profit model introduced in the first chapter. Ware writes that learning to interpret new graphic symbols comes with a cost, and that novel designs should be used only when their benefits outweigh the cost of learning to use them.


Sword Graphs Part II: Abstraction in Self-Encoding

In Sword Graphs Part I, I introduced the concept of self-encoding with this chart:


The graphic is self-encoded because the images themselves represent a value, rather than that value being translated into a mark like a bar or dot. Information about the length of the blade is represented by the length of the blade: the sword encodes itself.

But why not go a step further and show actual photographs of the swords, or a step fewer and use the same generic outline for all of them? The choice of images in self-encoding depends on specificity and processing speed.


Too many bees, not enough swarm

I’m working on a beeswarm plot about student loans, and I ran into a bit of trouble with my zero-debt group. If you aren’t familiar with beeswarm plots, take a look at this excellent example of gender ratios in newsrooms from Google Trends:


Each newsroom is a dot. Left-to-right position indicates the gender balance of the newsroom, and bigger newsrooms have bigger dots. Up-and-down position doesn’t officially indicate anything, but because the dots don’t overlap, the width of the “beeswarm” is an informal indicator of how many newsrooms have a particular gender balance.
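To make the non-overlap rule concrete, here is a minimal sketch of a beeswarm layout in plain Python. It is not the R code behind the plots in this post, and it is not the algorithm of any particular plotting library. The function name `beeswarm_offsets`, the equal-sized dots, and the greedy search are all assumptions made for illustration. Each dot keeps its true left-to-right value and is nudged up or down only as far as it takes to stop it overlapping the dots already placed.

```python
import math


def beeswarm_offsets(values, diameter=1.0):
    """Return one vertical offset per value so that no two dots overlap.

    Illustrative only: a simple greedy layout with equal-sized dots, not the
    algorithm used by any particular plotting library.
    """
    step = diameter / 4  # vertical search granularity (an arbitrary choice)
    placed = []  # (x, y) centres of the dots positioned so far
    offsets = [0.0] * len(values)

    # Place dots from left to right, keeping each one at its true x position.
    for i in sorted(range(len(values)), key=lambda j: values[j]):
        x = values[i]
        k = 0
        while True:
            # Try the centre line first, then alternate above and below it.
            candidates = [0.0] if k == 0 else [k * step, -k * step]
            free = [
                y for y in candidates
                if all(math.hypot(x - px, y - py) >= diameter for px, py in placed)
            ]
            if free:
                y = free[0]
                break
            k += 1
        placed.append((x, y))
        offsets[i] = y
    return offsets


# Three dots share the value 0, so they stack vertically; the dot at 5 stays put.
print(beeswarm_offsets([0, 0, 0, 5]))
```

Because identical values can only move vertically, a large group sharing one value grows into a tall column rather than a swarm, which is why the height of the cluster works as an informal count.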

In my case, I’m looking at student debt-to-earnings ratios for graduates of career-training programs. That is, how much of a person’s income goes towards paying their student loans every year? Unlike the example above, I have many small beeswarm plots, since I’m splitting the data by occupation.

Here’s how the plot came out of R:


Note the 91-program pileup at zero. So, what to do?


Sword Graphs Part I: Self-Encoding

For your consideration, swords:


The sword graph was a bit of self-indulgent fun, but it did give me an opportunity to reflect on graph humor and the appeal of self-encoding. I created the term “self-encoding” to describe charts where the object being described represents (or encodes) itself, rather than being translated into a more abstract image like a bar or a dot. Self-encoding preserves important quantitative information (such as the length of a hilt) while also presenting additional qualitative information (the presence or absence of a pommel, the shape of the crossguard).

Sometimes self-encoding is just for fun. Consider these two classics of the genre:



(Original sources lost to the mists of the Internet, but found here and here.)

There is a certain “can’t-argue-with-that” charm to self-encoding. These charts read like visual tautologies: the part of the chart that looks like a pyramid is encoded by the part of the chart that looks like a pyramid, the remaining pie is encoded by the remaining pie, the length of the blade is encoded by the length of the blade.

I will freely admit that my graph was a stab at comedy, rather than an attempt to communicate information about the British Museum’s collection of eighteenth-century swords. But self-encoding can be useful beyond the humor of unexpected juxtapositions. Take the self-encoded graphic of the lifecycle of a Japanese beetle:


(Originally printed in Man and Insects by L. Hugh Newman, scanned from The Visual Display of Quantitative Information by Edward Tufte.)

The beetle’s position underground throughout the year is represented as, well, the beetle’s position underground. However, by portraying the beetle itself rather than a more abstract dot, line, or bar, the graphic communicates the creature’s size, positioning, and development throughout the year.

Self-encoded charts are closely related to diagrams: both communicate qualitative details while illustrating an organism or item. However, self-encoding goes a step further by arranging images in a way that facilitates data visualization tasks like comparison or the detection of patterns and outliers.

Self-encoded charts also have a surface resemblance to pictographs, but they take matters a step further. Take the following (entirely fictional) pictograph:


The point of the pictograph is not that each potentially-sworded person has two arms and two legs and a mysterious floating head. These details make the icons instantly recognizable as people, but they’re superfluous to the quantity being shown. In a self-encoded chart, by contrast, the details are the information being shown: the images communicate some quality beyond quantity. One of the upsides of self-encoding is the ability to examine details that haven’t been directly measured. For instance, check out this 1864 diagram of river length, encoded by the actual rivers:


(Originally printed in Johnson’s New Illustrated Family Atlas with Physical Geography by Joseph Hutchins Colton.)

The St. Lawrence River (the one with all the lakes) and the Niger River (directly to the right of the one with all the lakes) are very similar in length, but could not be more different in terms of intersection with other bodies of water. The main piece of information communicated by this chart is river length, but self-encoding also reveals river shape, tributaries, and settlements along the way.

Self-encoding for humor is inherently limited: it works when classical graphical elements are repurposed by encoding an image’s area, length, or position as area, length, or position. However, some of the benefits of self-encoding, such as quick recognition and intuitive understanding, can be recreated in surprising and serious contexts. In this diagram of increasing political polarization, the ideological distance between American political parties is shown as actual distance:


(From “The Rise of Partisanship and Super-Cooperators in the U.S. House of Representatives” by Andris et al.)

This isn’t the plain-spoken humor of the sword graph: partisanship is a complex measure, subject to all kinds of transformation between observation and visualization. But the graphic makes instant intuitive sense by linking an abstract measure of distance with literal distance on the page, and showing the transition from muddy-colored cooperation into pure hues as the parties retreated further into ideological purity.

Self-encoding can dramatically increase a graphic’s information density in cases where one mark represents one sword (or one stage of a beetle’s life cycle, or one river, or one segment of actual pie). However, self-encoding also enforces a sort of information un-density. The technique is useful because it adds qualitative details. On the other hand, it is only useful when details are visible and recognizable, and therefore not suitable for trying to show a large quantity of data points.

Even when it is usable, self-encoding isn’t always appropriate as a tool. In the river example above, the kinks and turns of the rivers obscure their true lengths: someone who wanted to know precisely how long the Niger River is would have to turn elsewhere. The sword graph also only works because I could pick and choose between eighteenth-century blades. If I needed to include this curved sword, for instance, I couldn’t compare blade lengths by slapping a picture of it next to the straight-bladed swords. Like all visualization techniques, self-encoding is useful for specific tasks with specific audiences.

Self-encoding also requires some careful choices around imagery: what is communicated when an image is simplified down to its most iconic form, versus when it is shown in photorealistic detail? Sword Graphs Part II will explore those choices by taking a dive into visual perception and comic books.

Having acknowledged the weaknesses of self-encoding, I can now acknowledge that I am completely charmed by it. Watch this space for more illustrations in unexpected places. And if you’re interested in the British Museum’s eighteenth-century swords (they’re all real, even the wiggly one!), you can find them here.

Interaction Without Interactivity

Last week I sat in on a guest lecture by Xaquín G.V., a visual editor at the New York Times. He showed a variety of interactive projects rich in hooks. One article from his time at the Guardian asked readers to create a stable coalition government by dragging and dropping political parties. Another interactive was a surprise at the end of an article about the gender pay gap, showing how much more money a man would have made than a woman in the time since the page was opened.


Hook is an accurate term: as a reader, I immediately wanted to play with these visualizations. As a designer, I immediately wanted to make interactives like them. Unfortunately, I haven’t learned how to build interactive visualizations yet. So I started to wonder: how can I achieve a similar effect in static visualizations?
