Thursday, August 23, 2012

The Trouble with Infographics

Infographics have become a common means of presenting information to people in an easy-to-understand visual format. But their ubiquity doesn't guarantee that an infographic says what a viewer presumes it says at first glance. Consider the following map of the United States.

Recently, the National Oceanic and Atmospheric Administration released this map, along with a number of others, showing what the summer climate has been doing. Unsurprisingly, the media quickly picked up on the dramatic story. During an interview with Secretary of Agriculture Tom Vilsack, Kai Ryssdal, senior editor of public radio's Marketplace program, observed, "This has been, as you know, the hottest summer on record in a lot of places in this country." NPR's news blog, "The Two-Way," opens its piece on the heat with "The all-red map tells the story."

But does it?

If you look at the same data on the regional map, you see that plenty of the nation sweated through a warmer-than-normal July. But the heat wasn't record-setting in any single region. The “118” that shows up on the national map is nowhere to be found on the regional one.

The statewide map is different again. We can see even more of the variation in the average temperature for July. And we see that across Virginia, it was hot – the “118” reappears, although not so hot as to push the entire Southeast region of the NOAA map into the red. In fact, the Mountain West seems to have a higher average temperature. We can also see that “Near Normal” and “Above Normal” temperatures dominate states in New England, the Gulf Coast, the Southwest and the West Coast. People in these areas may have been surprised to learn that the summer had thus far been unusually warm.

And when you boil the data all the way down to the divisional level, the red spreads out to scattered parts of the country, but we learn that parts of Washington (like the Puget Sound area), Oregon, California, Texas and Louisiana had below-normal average temperatures during July. (Don’t worry; we got ours over the first few weeks of this month.)

The contiguous states are divided into a total of 344 divisions. In all, 17 of these divisions (about 5 percent), spread out over 12 of the 48 states, experienced record high average temperatures last month. That means a lot of people were looking for ways to stay cool, especially when you consider that Chicagoland is in one of the record-setting divisions. But many parts of the country, while warmer than normal, avoided pushing into record territory.

The culprit is, of course, averaging, both spatially and temporally. An average taken over a large area or a whole month can set a record even when few of the individual places or days being averaged do.

As an example, I've created a simple chart that measures a fictitious "Salamander Index" over a span of 15 years. The area being measured is divided into five separate regions, and the orange line on the chart represents the average value of all of the regions for that point in time. By year 15, the average is at record levels, yet, as you can see, only the East Region is in record territory; all of the other regions had scored higher on the index in the past – in some cases significantly so. In fact, although it isn't immediately evident from the chart, the South Region (the violet line), which spends much of its time above the overall average, is slightly below its own average level in year 15: across all 15 years, the South Region scores an average of about 31.8.
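For anyone who would rather check the effect with numbers than take a chart on faith, here is a minimal Python sketch of the same idea. The values are invented purely for illustration and are not the data behind the chart; the series is also shortened to six years, and the names `index_by_region` and `is_record` are just labels made up for this example.

```python
from statistics import mean

# Hypothetical index values, one list per region, one entry per year.
# These numbers are made up for illustration; they are not the chart's data.
index_by_region = {
    "North":   [40, 28, 27, 29, 30, 36],
    "South":   [30, 42, 29, 28, 31, 37],
    "East":    [25, 26, 27, 28, 30, 38],
    "West":    [27, 28, 41, 29, 30, 36],
    "Central": [28, 27, 29, 43, 31, 37],
}


def is_record(series):
    """Return True if the final value is strictly higher than every earlier value."""
    return series[-1] > max(series[:-1])


# Spatial averaging: for each year, average the index across all regions.
overall = [mean(year_values) for year_values in zip(*index_by_region.values())]

# Is the all-region average at a record in the final year?
average_at_record = is_record(overall)

# Which individual regions are at their own record in the final year?
regions_at_record = [
    name for name, series in index_by_region.items() if is_record(series)
]

print("Average at record in final year:", average_at_record)
print("Regions at record in final year:", regions_at_record)
```

With these made-up numbers, the all-region average hits a record in the final year even though East is the only region at its own record. Each of the other regions peaked in a different earlier year, so no single earlier year had a high average.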
So it's important to remember that while infographics, especially simple ones, make data easily digestible, they don't always provide as accurate a picture as it might seem at first glance.


Keifus said...

I don't know man, that's a lot of words--couldn't you have reduced the post into a handy graphical form? (yar har)

The word for this, or one word for it anyway, is "granularity," and sometimes it matters. How finely do we chunk it up (and a related question, how many terms do we need)?

I wonder how the country looks for yearly rainfall, incidentally.

Aaron said...

If you want the rainfall data, it's also there. You can get it from here:[]=Nationaltrank&imgs[]=Nationalprank&imgs[]=Regionaltrank&imgs[]=Regionalprank&imgs[]=Statewidetrank&imgs[]=Statewideprank&imgs[]=Divisionaltrank&imgs[]=Divisionalprank&ts=ytd

And you're absolutely right, "granularity" is the right word. I suppose I was having too much fun being long-winded to think of it at the time. :)