Data Journalism Archives - Nightingale | The Journal of the Data Visualization Society

Why Visual Journalism Is So Slow
https://nightingaledvs.com/why-visual-journalism-is-so-slow/ (Wed, 27 Aug 2025)

Ever wondered what consumes a visual journalist’s day? I tracked every hour of a typical project to uncover the biggest time drains and share insights on how to speed things up.

Visual journalism is slow—at least it seems so, compared with written journalism. While many written pieces take a day or two, visual stories can easily take weeks to produce. Meanwhile, the news cycle advances a few rounds. By tracking my time on a sample project, I identified three time sinks that contribute to the slower pace of visual journalism:

  1. Research takes longer than in written journalism
  2. It involves many more tools
  3. Visual journalists are also designers

A typical project

The examined specimen is a piece on the cramped living conditions in the Gaza Strip. Its main visual is a 3D illustration of rooms filled with people. This chart contrasts the living space of a typical Swiss person with the space available to someone living in Gaza. The centerpiece is complemented by a map showing what proportion of buildings are still intact, and how the south is filling up as people flee there. From pitch to publication, the piece took two weeks.

The three main charts produced for the article. (Image provided by the author.)

This piece is a perfect example for our experiment: the result is fairly standard for visual journalism, but uncharacteristically, it was mostly done by a single person. That made it easier to pinpoint where the time was spent. The two-week timesheet looks like this:

The article took two weeks from pitch to publication. The timesheet shows how long different activities took. (Image provided by the author.)

Research and analysis took up most of the time—four days in total. Of the ten days, I spent only seven working on this project; during the remaining three, I worked on other projects, administrative tasks, and an in-house training. Design and graphics production took two days, and the text just about a day.

Time sink 1: Research takes longer

Research is by far the most laborious part of producing a visual piece. Such articles inherently require more research than most texts to tell the story right. Take just one detail: it takes 10 seconds to write that “people from the north of the Gaza Strip have fled to the south.” But it takes an hour to find out that the border separating north and south runs through the Wadi Gaza wetlands, and even longer to find the geodata to show this on a map. Researching just a few such details quickly amounts to a day’s work—simply because you want to show something rather than tell it.

Wadi Gaza separates the north from the south of the Gaza Strip. (Image provided by the author.)

Even after locating the right information, it is often not in the right form to compare with other pieces of information. In the Gaza example, I had estimates of the number of destroyed buildings. But what I was really looking for was the amount of destroyed living space: an apartment building in Gaza City represents a bigger loss of living space than a farmhouse outside of Rafah. So I had to make educated guesses based on district population numbers. Finding and combining those numbers with the data on destroyed buildings took even more time. This is how I ended up spending four days on research and analysis alone.

One way to speed up this work is through specialization. When journalists focus on the same topics for a while, they become familiar with the sources and data. This happened during the COVID-19 pandemic with data journalists. As they worked with the same datasets over and over, their productivity increased. Visual journalists, however, are usually few in number and often remain generalists, unlike other journalists who typically cover a limited range of topics or beats.

Another strategy is to advocate for more open data. We can lobby governments, organizations, scientists, and companies to provide more information and make it more accessible. This doesn’t just mean numbers—plans for machines or architectural models are just as valuable and often even harder to obtain.

Time sinks 2 and 3: Tools and redesign

To create the main chart, I used three tools. First, I modeled the rooms and figures in Blender. Then, I exported these models to Figma to add labels and create variations for different device sizes. Finally, I uploaded everything into Q, a graphics production tool integrated with our CMS.

I used three tools to produce the 3D illustration: Blender to model the rooms, Figma to add text labels, and our own charting tool Q to integrate them into the CMS. (Image provided by the author.)

While exporting graphics between tools takes only a few seconds, these seconds add up as we make changes and iterate. This is especially true when we experiment with new representations, like the dot-density plot turned into a room filled with people. A lot of iteration went into deciding on camera angles, colors, and figure poses. And with each iteration, the chart has to pass through all of the tools before we can see the final result.

For more established charts, like the map of the Gaza Strip, the process is shorter. We’ve done the map multiple times and many design choices are settled: the shades of gray for buildings, the width of roads and rivers, etc. So, the first remedy for a complex toolchain and eternal redesigns is standardization. At NZZ, we use a style guide and share templates to reduce the number of design decisions needed.

A more challenging, but also more exciting improvement would be better integration between design tools. Something similar to the live reload feature commonly found in HTML editors or interactive notebooks. If any researcher is up for the challenge, I’d be happy to talk!

What about division of work?

Work is often split between writing and graphics production. However, the timesheet shows that only one of seven days was spent writing. This highlights a common frustration: writers finish their half of the work and must wait for graphics to catch up.

A more effective approach is to split the research tasks among multiple people. After all, these take most of the time. Anecdotal evidence from our newsroom shows that this speeds up the process. However, research for visual pieces is different from that for written pieces, as visual pieces often require specific details that are unnecessary for the text. For example, researching that “people in northern Gaza fled to the South” is different from researching geodata indicating the border of the two. Therefore, not all journalists can immediately contribute research for visual pieces; it requires some training and experience in working together.

Conclusion

In summary, I propose five ways to speed up the production of visual pieces:

  1. Specialization: Visual journalists should have areas of expertise, especially for major events like COVID-19, the war in Ukraine, or the conflict in the Gaza Strip.
  2. Advocate for Open Data: Lobby for more open data, particularly non-numeric data such as airplane blueprints or architectural models.
  3. Use Templates and Style Guides: Implement templates and a consistent style guide to streamline production.
  4. Optimize Workflows: Develop workflows that better integrate various design tools and make iteration faster.
  5. Train Journalists: Educate other journalists on how to conduct research for visual pieces.

These improvements can make visual journalism more responsive to current events. However, even with these changes, it is unlikely to reach the speed of written journalism. This list is definitely not exhaustive, and I’m happy to read your thoughts and experiences.

Categories: Data Journalism

Data About America’s Communities Are In Jeopardy, and Lives May Hang in the Balance
https://nightingaledvs.com/data-communities-in-jeopardy/ (Tue, 03 Jun 2025)

Over the years, I’ve worked with counties across California focused on combating drug overdose in their communities. My goal is to help local organizations leverage data to assess the impact of fentanyl and other opioids while communicating these findings to community leaders who can take action. In short, the aim is to use these data to save lives. 

However, there is one group with which I’ve worked and written about before for Nightingale—the Yurok Tribe in far Northern California—where there is no well of data from which they can draw. For many reasons, overdose data are not available for them to understand the deep impact of overdose on Native American tribal members. I remember Yurok Tribe members telling me that they were flying blind with no access to useful data and that they often don’t know about an overdose until one of their tribe members dies from it, when it’s obviously too late to provide supportive and life-saving services.

I fear we may be entering an era when many more communities across California and the country will be flying blind without access to data—and on a range of issues, not just the devastating impact of overdose. Among the swirl of changes taking place within the federal government these last few months, you may not have noticed that the availability of meaningful, community-level data is under serious threat. As staff across U.S. data-collecting agencies are let go (and with them, institutional knowledge is lost), budgets for data work shrink, and federal data advisory boards are disbanded, the capacity of the federal government to collect data, conduct surveys, and publish community-level findings could greatly diminish.

We won’t notice these impacts immediately. After all, the Census Bureau, the Centers for Disease Control and Prevention, and other federal agencies often take a few years to publish community-level data on poverty, crowded housing, nutrition, smoking, domestic violence, and suicide prevention, among many other topics. One person with deep experience managing federal data described recent developments in the U.S. data infrastructure to me as a slow rot, as if termites are, bit by bit, eating away at the foundation of federal data. So it may be years before we truly see the extent of this damage, and by then, it won’t be easy to simply reinforce the foundation with minor repairs.

Few would likely argue with the concept that we need detailed data—including from federal data systems—for the U.S. to compete effectively in a worldwide marketplace where companies, and countries, increasingly rely on data to get ahead. From the perspective of global competitiveness, deconstructing our federal data systems seems short-sighted. After all, to compete globally, we need current and reliable data, better breakdowns, and a greater capacity to interpret, visualize, and communicate meaningful findings.

But that’s the world stage. For those of us in the visualization community, we soon may have less social-good data to visualize, and innovation in public-sector visualization could slow. Why, though, are data important to communities across America? There are countless ways in which individuals harness these data to save lives, build safer communities, and improve local well-being:

  • Schools use data on reading and math proficiency, for example, to improve curriculum for our children
  • Local hospitals and county health departments examine government data about service delivery and health care conditions impacting the community, in order to improve medical care and provide preventative services
  • Adult kids seek Medicare data on the quality of nearby nursing homes for aging parents
  • Realtors increasingly share public data with clients about crime and the quality of life in neighborhoods to help people make informed decisions about where to buy a home
  • Many of us consult the local weather forecast each day—the federal government is a key source for this information, especially for tornadoes, hurricanes, heat waves, and other weather emergencies
  • And, as noted above, data are used by coalitions to help communities save lives by addressing the threat of overdose

These data are not bound by political lines. They benefit Republicans, Democrats, and independents alike. People of all political persuasions can, and do, make use of the treasure trove of data that the U.S. government publishes, often thanks to data translators who participate in the Data Visualization Society.

Our local elected officials—county supervisors, the school board, town council members—rely on these data, too, for good governance and effective policymaking. And, of course, access to quality data helps us evaluate our politicians’ policy choices and keep them honest. 

In short, these data are vital to help communities thrive, and lives hang in the balance with decisions we make using these data. These data are not just numbers. They represent each of us and the communities in which we live, and we have every right to the high quality, detailed data for which we pay as taxpayers.

Actions you can take

So, what can you do as someone focused on data visualization about the threats to federal data? 

Be aware of the slow rot that we’re beginning to see in our federal data infrastructure. 

Use the data we have now while encouraging your community leaders to do the same. Maybe such usage will make it harder to take away these valuable resources. An array of data tools leverages what’s available now from federal sources to provide summaries of how your community is faring on wide-ranging topics (I maintain a listing of roughly 100 such data websites).

Join efforts to do something. National groups taking a lead role include the Association of Public Data Users, the Data Rescue Project, and the newly launched Federal Data Forum, sponsored by the Population Reference Bureau. For anyone in California who’s concerned, there’s a group of us now meeting to address the threats to federal data on our state’s communities, so you can join us. And other states could be, and maybe are, taking similar action. 

And let politicians on both sides of the aisle know that federal data are under threat in ways that harm all of us and could have lasting negative impacts for the communities our children and grandchildren will inherit.

The Art of the Trump Tracker: How Data Viz Can Combat News Fatigue
https://nightingaledvs.com/art-of-the-trump-tracker/ (Tue, 29 Apr 2025)

Since coming into office for his second term, President Donald Trump has issued more than 124 executive orders—with nearly as many lawsuits filed in response. Much like the early days of Covid, the data viz community is being asked to make sense of this tidal wave of new information. Data viz practitioners are responding to the challenge in different ways. 

The Impact Project and similar organizations are focused on mapping the regional impacts of federal employee firings and science funding cuts. The Washington Post put out an innovative choose-your-own-adventure flowchart and a calendar-based tracker of executive orders. 

As the months have dragged on, most news agencies have defaulted to standardized and easily-editable tables and drop-down menus by topic. Each news outlet is faced with the same dilemma: What information do you provide and how much of it? Every graphics editor has had to dust off their notes on cognitive load and come up with their own solutions. 

Most data viz practitioners will agree that less is more here. Give people the topic, a short description, and a simple status update. This is a tracker, not a full story, after all. Simple data viz designs ensure the tracker is easy to use and update—but is it actually easier to understand than a full story? 

For executive orders that have been blocked by a preliminary injunction or temporary restraining order, do you opt for concrete legal language or go with something more relatable like “blocked for now”? For lawsuits that have been filed but on which the courts have not yet acted, do you go with the literal “lawsuit filed” or the more descriptive “awaiting decision/in progress”? And in the multiple cases where the Trump administration appears to be defying court orders, do you note that in some way? What about threats against judges?

These are hard calls to make and I genuinely appreciated seeing AP be honest about this with a category on their tracker called, “It’s complicated.” In some respects, that uncertainty should be reflected in our data visualizations, lest we normalize what is an incredibly abnormal time in the United States.

In my own work at KUOW, NPR’s Seattle station, I sought to answer two simple questions for our audience: (1) How are Trump’s orders impacting people in Washington state and (2) What is the state doing about it? I made a conscious choice not to build this tracker for lawyers. They don’t need it. That means everything in my tracker had to come back to my top two questions without relying on legalese. I landed on three status options. At any point in time, a Trump executive order could be: In effect, partially in effect, or not in effect.

By limiting myself to the actual outcome of these orders, I was able to highlight how executive actions benefit from the slow pace of the court system. As cases wait in limbo, the orders that sparked them remain in effect and impact Americans. The “partially in effect” status allowed me to capture the messiness of these court battles. In the case of fired federal employees, two similar cases were being heard simultaneously. In one, the court required fired employees in 19 states to be reinstated. In the other, the court required fired employees within six agencies to be reinstated. For Washington state, that meant some employees would be reinstated while others would not.
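A three-way status like this can be derived mechanically once the scope of each court ruling is encoded. The sketch below is purely illustrative (not KUOW's actual code), and the group names are invented:

```python
# Hypothetical sketch of deriving the tracker's three statuses from
# overlapping court rulings. Illustrative only, not KUOW's
# implementation; group names are invented.

def order_status(groups_affected, groups_blocked):
    """Return 'in effect', 'partially in effect', or 'not in effect'
    for an order, given the groups it affects and the groups for
    which courts have blocked it."""
    blocked = set(groups_affected) & set(groups_blocked)
    if not blocked:
        return "in effect"
    if blocked == set(groups_affected):
        return "not in effect"
    return "partially in effect"

# One ruling reinstates employees in 19 states, another covers six
# agencies; for Washington state, some fired workers fall under a
# ruling while others do not.
affected = {"WA workers, covered agencies", "WA workers, other agencies"}
blocked = {"WA workers, covered agencies"}
print(order_status(affected, blocked))  # partially in effect
```

The point of the comparison against the full affected set is that "not in effect" only applies when every affected group is covered by a block; anything in between is the messy middle the tracker surfaces.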

Our newsroom also made the choice to add the tag, “Trump is possibly defying the courts” to our list of options. This was after significant discussion about the importance of presenting the news as it stands, uncertainty and all. Turmoil and all.

My Trump tracker has gone through a few iterations—and will no doubt continue to evolve—but for now, it serves as a landing post for our Washington state audience to understand how they intersect with these big national stories. The tracker consists of two graphics: (1) A quick filterable status list and (2) A detailed table with background information and links to our original reporting on these issues at home. You can visit KUOW’s Trump tracker here.

The Endless Stories in Baby Name Data
https://nightingaledvs.com/the-endless-stories-in-baby-name-data/ (Mon, 21 Apr 2025)

Throughout my time as a student, I was used to being known by a last initial. My first name, Emma, has consistently ranked in the top 5 for girls since 2002. Though my parents were slightly ahead of the curve in 1998, the ubiquity of my first name means I’m rarely the only Emma in a room.

I remember when my mom told me they had considered Anastasia for both my sister and me, and how I became transfixed by the parallel-universe version of myself who shared a name with the lost Russian princess. In middle school, I briefly flirted with the idea of having a more unique name, signing my emails with Esmerelda and Chrysanthemum. While my brief rebrand is mostly attributable to middle school-era awkwardness, the individuality I was searching for still reflects the broader obsession we have with baby names. People want their kids to be special, and a name is the first choice where parents can make that happen.

“When people choose a name, it’s a form of self-expression, a way to establish a legacy and make a statement about your tastes and priorities by conforming to or bucking trends,” Elizabeth Cohen, a communications professor at West Virginia University, told The Cut in 2019.

Comprehensive data from the Social Security Administration speaks to what the current and rising trends are, albeit with a slight delay. For the analyses here, I used the Social Security Administration’s bulk data, which is available at the state and national level. While SSA’s interactive website allows you to query the Top 1,000 names, this database contains every name with at least five babies born that year. The database divides the top names by male and female births, and I used Python to examine ranking trends, unique names, popular endings and more before creating final visualizations using a mix of Datawrapper and Adobe Illustrator. 
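As a sketch of what that Python step can look like: the SSA’s national files (yobYYYY.txt) hold one “name,sex,count” line per name. The snippet below parses that format and computes a ranking and a top-k share; the inline sample and its counts are made up for illustration, not actual SSA figures.

```python
# Parse the SSA national file format ("name,sex,count" per line) and
# compute rankings and the share of births taken by the top-k names.
# The sample data is invented; real files list every name given to at
# least five babies in a year.

sample_yob = """Olivia,F,150
Emma,F,130
Charlotte,F,120
Liam,M,160
Noah,M,140"""

def rank_names(raw, sex):
    """Rank one sex's names by descending birth count."""
    counts = [
        (name, int(n))
        for name, s, n in (line.split(",") for line in raw.splitlines())
        if s == sex
    ]
    return sorted(counts, key=lambda t: -t[1])

def top_k_share(raw, sex, k=5):
    """Fraction of all births of one sex that received a top-k name."""
    ranked = rank_names(raw, sex)
    total = sum(n for _, n in ranked)
    return sum(n for _, n in ranked[:k]) / total

print(rank_names(sample_yob, "F")[0])             # ('Olivia', 150)
print(round(top_k_share(sample_yob, "F", 2), 2))  # 0.7
```

The same two functions, run over each year's file, are enough to trace ranking trends and the shrinking dominance of the top names discussed below.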

Emma’s resurgence came at a time when classic-sounding names with alternating consonants and vowels were becoming popular again. Pop culture likely thrust it into the top five, too: in the final episode of Season 8 of “Friends,” released in May 2002, Ross and Rachel named their baby Emma. The name was #4 that year.

Emily had also dominated the #1 spot in the ’90s and early 2000s, and its exit from first place in 2008 ushered in a brief power struggle between Emma, Isabella, and Sophia. Emma eventually held the top spot for five years, from 2014 to 2018, before Olivia bumped it down to second.

As a data journalist, I’ve covered a number of angles on baby names. I’ve contributed to stories on U.K. vs. U.S. naming trends, uniquely popular names in every state, influence of famous characters, etc. There’s an insatiable appetite for baby name stories.

The appeal on the data side is clear. The baby name database from the Social Security Administration is available at a far-reaching and granular level. It has data dating back to 1880 for each state, and every single name bestowed upon at least five babies is listed. It’s a huge scale of information available for free on a buzzy topic. 

Social media has inevitably amplified our fascination with baby names. One infamous meme shows an expectant mother crossing off the names Taylee, McKarty, Nayvie, and Maylee, landing on Lakynn. Facebook groups and subreddits have emerged where users share names with less than universal acclaim. TikTok has intensified this further with baby name gurus and trending name videos.

As the search for a one-of-a-kind baby name has risen, the share of babies born with the top five names has fallen dramatically.

State-level data reveal how much this varies across regions. For female births, the differences are most stark: over one in ten girls born in Vermont, Rhode Island, and Wyoming in 2023 had a top 5 name, whereas Texas, Mississippi, and Georgia reported the smallest shares, below 4.6%.
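The state-level comparison follows the same pattern, assuming the layout of the SSA’s state files (one “state,sex,year,name,count” line per row). The rows and the top-5 list here are invented for illustration:

```python
# Compute the share of one state's births (for a sex and year) whose
# name appears in a given top-5 list, mirroring the SSA state file
# format ("state,sex,year,name,count"). All data below is invented.

rows = [
    ("VT", "F", 2023, "Olivia", 30),
    ("VT", "F", 2023, "Emma", 25),
    ("VT", "F", 2023, "Maple", 45),
    ("VT", "M", 2023, "Liam", 40),
]
top5 = {"Olivia", "Emma", "Charlotte", "Amelia", "Sophia"}

def top5_share(rows, sex, year, top5):
    """Fraction of births (sex, year) whose name is in the top-5 set."""
    total = in_top5 = 0
    for _state, s, y, name, count in rows:
        if s == sex and y == year:
            total += count
            if name in top5:
                in_top5 += count
    return in_top5 / total

print(top5_share(rows, "F", 2023, top5))  # 0.55
```

Running this per state file and sorting the results is one way to surface the Vermont-versus-Texas spread described above.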

As the gap shrinks between the top baby names and middle-ranking baby names, we’re also seeing a rise in the number of unique names. 

Namesakes have become less common compared to the 20th century, when people were often named for relatives. According to a YouGov survey, more people are named after a family member in their middle name than in their first name.

Unique baby names can draw inspiration from objects or places. When I dug into data on the most distinctive names by state for a story, I found that landmarks and cities like Atigun in Alaska and Sedona in Arizona influenced names regionally.

Notably, girls’ names have been much more subject to unique naming trends. Researchers who have explored this disparity have coined the term “playground effect”: parents tend to worry more about cruelty and bullying for their sons.

Additionally, marketing around baby name tips is still largely targeted towards mothers, and the same YouGov survey suggests that when only one parent chooses the name, it is more likely the mother. 

The share of unique names today is close to a level last seen in the late nineteenth and early twentieth centuries. Early waves of immigration likely played a role here, as the decline in unique names overlaps somewhat with the drop in American children born to foreign-born mothers in the early 1900s. When we talk about baby name data, the role of online trends is obviously a fun discussion point, but shifting name popularity also speaks to immigration and changing demographics. In my analysis of the most distinctive names in every state, trends in some states spoke to this: the top names in New York, for example, were ethnically Jewish, and names with Arabic origins were featured in states like Michigan, which has a high Arab-American population relative to the nation.

Some of the surge in unique baby names is born from an effort to spell names differently from their traditional forms. The popularity of “leigh” names has become a poster child for this shift. While at one point it may have seemed like “leigh” names were everywhere, data shows other “ly” suffixes remained dominant even during leigh’s peak.

Ashley was a leading name throughout the ’80s, and challenger spellings like Ashleigh and Ashlee also emerged. It wasn’t until the 2000s that the infamous “leigh” started to truly ramp up, though. The rise of blogs and social media likely played a part here, amplifying alternatives like Everleigh or Kyleigh that otherwise may not have taken off.

Given U.S. demographics, names that are more popular among white parents make up a higher share of total births. Trends in specific unique names can also be hard to pinpoint because of variable spelling and SSA’s five-baby minimum. However, “leigh” names, stereotypically associated with white mommy bloggers and sharing a common suffix, are easily grouped together as a name fad, as well as a target for online ire (see r/tragedeigh).

Recently, though, all kinds of “ly” suffixes have begun to decline.

One more key trend is the rise in gender neutral names. While this may be partially driven by some families eschewing gender norms for their children, the search for unique names for girls comes into play again too.

For Blake, the rising number of girls with that name coincided with a falling number of boys. Blake Lively may have helped normalize her name for girls. Time will tell if Lively’s daughter James will have a similar impact.

In the case of Charlie, more girls are receiving the historically masculine name, but it still remains popular among both sexes. 

In the case of Alexis, declining popularity among girls meant that nearly the same number of boys and girls were given the name in 2023. Azariah and Finley are seeing matching popularity among male and female babies.

There can be immense pressure in naming a baby. For many people, this will be the name their children use when seeking jobs, dating, and going about their adult lives. Even if we wish it weren’t so, there are still preconceived biases associated with names. The market around baby names speaks to this general desire to balance the search for individuality with a realistic understanding of those factors.

For those of us who work with data, the shifting standards of how we name our kids provides a playground for exploring the pop culture influences, shifting demographics, and changing social standards behind baby name trends.

The Hidden Toll: Microaggressions and the Impact on People of Color in the Workplace
https://nightingaledvs.com/the-hidden-toll/ (Fri, 04 Apr 2025)

Minority groups face microaggressions regularly in their day-to-day lives, but these incidents are most often encountered in the workplace, and more often than is documented. Many in marginalized communities are taught to compartmentalize their experiences and report only the most blatant offenses. As a result, those committing microaggressions may not realize what they’re saying is problematic, and those reviewing the cases may assume these incidents are too rare to deserve serious attention.

Microaggressions are subtle or indirect forms of discrimination against marginalized groups. They can be categorized into three types: microassaults, microinvalidations, and microinsults. While these behaviors are often difficult to recognize—and even harder to acknowledge when committed—their impact remains significant.

A 2007 study conducted by the National Institutes of Health found that microaggressions can have long-lasting negative effects on mental health, job satisfaction, and career progression among marginalized groups.

Research like this highlights the long-term effects of microaggressions in the workplace by showing how subtle, everyday interactions can accumulate over time and create significant emotional and psychological harm. While society has made strides in becoming more aware of these instances, later studies have continued to highlight concerning findings.

According to a 2019 study from The Journal of Applied Social Psychology, microaggressions in the workplace can decrease job satisfaction and increase turnover intentions among employees of color by more than 25%. A study in 2018 by the National Institutes of Health found that microaggressions experienced in the workplace can increase stress levels by 40% among people of color. Another report in 2020 by the American Psychological Association found that environments with unchecked microaggressions lead to decreased productivity and higher turnover rates, especially for marginalized employees. 

Despite progress in creating more inclusive workplaces for people of color, microaggressions remain widespread. Many individuals don’t realize they’re committing these subtle acts of discrimination, while others feel too afraid or emotionally drained to report them. Additionally, some people simply don’t prioritize or care about addressing these issues, allowing microaggressions to persist. These challenges highlight the need for continued efforts to create truly inclusive environments where such behavior is recognized and addressed.

A recent anonymous Instagram poll and questionnaire hosted on my page @garrulousgirl_ revealed some concerning findings about microaggressions in the workplace. When asked if they had ever experienced microaggressions at work, 89% of respondents said yes, while 11% said no. However, when it came to reporting these incidents, only 25% reported them, while 75% did not.

A bar chart representing data collected by the author during an Instagram poll.

For those who did report, 10% said they reported after a single incident, 20% after one or two instances, and 70% waited until they had experienced three or more occurrences. These results highlight the widespread nature of microaggressions in professional environments and the challenges individuals face when deciding whether or not to report them.

The findings underscore that microaggressions are still a very common experience in the workplace, with 89% of respondents reporting they’ve encountered them. However, the results also show a significant gap in reporting, as 75% of respondents chose not to report the incidents. Among those who did report, 70% waited until they had experienced multiple occurrences, suggesting that microaggressions are not only prevalent but often go unaddressed. This points to potential barriers, such as fear of retaliation or lack of support for those who experience them.

Participants in the Instagram poll and questionnaire shared personal stories of microaggressions they encountered in the workplace, ranging from microinvalidations to microinsults.

One Instagram poll respondent said, “Every time I got a new hairstyle, I was looked at like an alien.”

Another shared how coworkers repeatedly addressed her with a term that felt dismissive or racially charged instead of using her name.

“I was continuously called ‘sister’ literally every day,” another Instagram poll participant said.

While perhaps intended as a term of endearment, it can feel dismissive or exoticizing, reducing a person’s cultural identity to something “other.”

A study by the Workplace Bullying Institute published in 2017 found that only 33% of employees who experienced discrimination or harassment in the workplace chose to report it, citing reasons such as fears of being labeled as “difficult” or “sensitive” and doubts about the effectiveness of reporting systems. Furthermore, according to a 2016 report from The Journal of Business Ethics, 47% of employees of color report they feel that complaints about microaggressions are often dismissed or trivialized by management, which further discourages reporting.

Other participants in my Instagram poll and questionnaire shared that the subtle, hurtful language was often paired with harmful and unjust treatment in work culture and promotions.

“I was at the job longer than others but was placed at the bottom of the tenure list,” said an Instagram poll respondent.

Others reported encountering race-based assumptions while navigating the marketing field, particularly when strategies were developed for different racial groups—even after speaking directly with a person of color.

“Black people may not like this or that,” an Instagram poll participant said.

The same participant shared feelings of defeat when their credibility was questioned and they were ultimately told to “do things ‘their’ way instead of suggesting my own.”

A screenshot from the author’s Instagram poll.

 “I had to fight for a seat at the table,” a poll participant said. “I had to display my value before being invited to meetings.” 

These personal stories illustrate how microaggressions can create a sense of alienation, exclusion, and frustration in the workplace. They also highlight the need for workplaces to foster an environment where these issues can be addressed, and employees feel empowered to speak up without fear of negative consequences.

In addition to more overt microaggressions, people of color often face subtle comments that are difficult to detail in formal reports because of their indirect nature. 

These remarks may seem harmless on the surface but can carry significant emotional weight. 

One common example is the seemingly well-meaning comment, “I have Black friends,” which many respondents reported hearing as if it somehow absolves others from bias or insensitivity. 

Hair-related microaggressions were very common. 

One participant said, “I was told my hair was frizzy when worn curly, or that it was ‘interesting.’”

“Your hair is so interesting,” another Instagram poll respondent said.

These types of microaggressions often go unreported because of their nuance, yet their cumulative impact can leave individuals feeling marginalized and uncomfortable in the workplace.

A shortened list of words flagged to limit or avoid, according to a compilation of government documents. (Source: The New York Times)

President Donald Trump signed executive orders to roll back federal efforts aimed at promoting inclusive workplaces and discouraging diversity, equity, and inclusion (DEI) measures in the private sector, directly impacting offices that would typically protect individuals in such cases—even flagging more than 50 words surrounding race, ethnicity, gender and sexual orientation.

While offices like Title IX continue to fight for equity on campuses, the president of the United States has labeled such efforts “illegal,” creating further uncertainty about how to address workplace discrimination and making it difficult not only to bring awareness to inequities but also to file complaints against them.

These moves raised concerns about the future of workplace equality, as they directly threatened the resources and support systems that help create more inclusive and equitable environments. These shifts in the political climate have left many people of color and minorities questioning how to effectively challenge discrimination, including microaggressions.

The post The Hidden Toll: Microaggressions and the Impact on People of Color in the Workplace appeared first on Nightingale.

Visualizing Meth Addiction: Data Reveals The Truth About Methamphetamine Addiction in The US https://nightingaledvs.com/visualizing-meth-addiction/ Wed, 23 Oct 2024 16:15:26 +0000 https://dvsnightingstg.wpenginepowered.com/?p=22303 Despite the lack of conversation, the abuse of methamphetamine remains an extremely serious problem in the United States. According to the CDC Wonder Database, in..

Despite the lack of conversation, the abuse of methamphetamine remains an extremely serious problem in the United States. According to the CDC Wonder Database, in 2023 alone, over 34,000 deaths were attributed primarily to methamphetamine, a psychostimulant.

I spent over a year in treatment after my overdose in 2021. I had been using drugs the way a lot of people are using drugs now—mixing psychostimulants and illicit depressants to achieve the so-called “perfect high.” I am now over three years sober but have lost five friends to drug-related deaths and countless others to relapse. Too many of those friends, people in their late teens and twenties, went back to using meth.

In my journey of recovery, I have delved into the data from a writer’s perspective. What percentage of people in the U.S. are addicted to meth? What kinds of treatment are most effective at helping those with meth addiction? Of all the people struggling, how many seek treatment?

Among us

In some areas of the U.S., meth is an even greater threat than opioids, and because of its detrimental effects, it is the number one illegal drug contributing to violent crime. According to 2021 data from the National Institute on Drug Abuse, more than 16.8 million people aged 12 or older had used meth at least once in their lifetime. In 2021, an estimated 2.5 million people reported using meth in the past 12 months.

Fig. 1 – A bar chart showing the number of people who used meth in the past 12 months compared to the number of people who used meth at least once in their lifetime.

Institutions

When considering these questions, the first place I looked was hospitalizations that, in some way or another, were due to meth use and abuse. There are a number of effects that will land a person in the emergency room: overdose, seizure, dehydration, drug-induced psychosis, withdrawal, and respiratory infections. The National Library of Medicine reports that the prevalence of drug-related hospital admissions varies from 1.3% to 41.3%, with an average rate of 15.4%. Among hospitalized patients, 2.7% died due to drug-related problems.

Fig. 2 – A donut chart showing drug-related hospital admissions, from highest to lowest, and the percentage of deaths among those admissions.

In the U.S. alone, drug overdose deaths tripled between 1990 and 2013, and in the first year after the COVID-19 lockdown, 100,000 Americans died of overdoses.

Fig. 3 – A racing bar chart showing the growing overdose death rates by drug category from 1999 to 2022.

Even in today’s world, where information about almost anything can be found with the click of a button, it takes hours of research with few sources, some almost illegible, to find information about the meth epidemic in the U.S. After digging for a while, I discovered the Wiley Online Library’s statistics about methamphetamine-involved hospitalizations. In 2018 alone, out of 883 drug-related visits to the emergency room, 403 patients tested positive for methamphetamine. These emergency department visits in the United States are increasing yearly, with hardly any notice.

Fig. 4 – Donut chart showing meth-related emergency department visits out of all emergency department visits in 2018.

Monster drug

Meth is a different ball game when it comes to illicit drugs. It hijacks the way your mind works, exploding the pleasure receptors; as writer David Sheff puts it, it makes you feel like a thousand Roman candles have been lit inside your brain. It also induces extreme paranoia and, for some, hallucinations. It slowly deteriorates teeth and heart valves, and the damage done to the brain can be irreversible if you don’t get clean early enough and stay clean long-term. For a drug that inflicts so much damage, it’s all too easy to get, and for those who have that addictive gene, it hooks you immediately.

When I started going to CMA (Crystal Meth Anonymous), after months of frequenting mainly AA (Alcoholics Anonymous) meetings, I saw that they gave out a chip for every month up to two years. In AA, you pick up chips for 24 hours, 30 days, 60 days, 90 days, 6 months, 9 months, and then a year. From then on, you only pick up another chip if you relapse or hit an anniversary. CMA was different. That’s because meth is different: it’s considered a miracle to stay off meth for days in a row, let alone years. When you first come into recovery, people in the 12-step recovery rooms of AA, NA, and CMA will tell you to buy a suit or a black dress. They will tell you that the odds of staying sober, for you and those you come in with, are slim.

Trouble with treatment

In the United States, there are 994,000 people 12 years of age or older with methamphetamine use disorder, and that number is increasing. When it comes to drug addiction, the choice to go to a treatment center is an incredibly difficult one to make. Many people who need treatment don’t seek it for a number of reasons. I fought hard against rehab and almost died because of it. I didn’t want to go to treatment because I didn’t care that I had a problem, and I didn’t think life could be worth living without drugs. Approximately 40% of people suffering from addiction do not seek treatment because they don’t want to stop using.

Drawn figures of ten people, with four of them colored light purple and the other six a dark purple.
Fig. 5 – People chart showing 4 out of 10 people who suffer from Substance Use Disorder do not seek treatment.

That’s 4 out of 10 people.

A staggering 43% of U.S. adults who say they needed substance use or mental health care in the past 12 months were not given the resources they needed in order to receive that care.

This time last year, the Food and Drug Administration stated that we “critically needed to address treatment gaps,” because there is currently no FDA-approved medication for stimulant use disorder. Medication is not always the solution, however, as many of the medications used for patients with SUD can further the problem or inhibit a long-term solution.

12-step programs

12-step programs are free and widely available, especially in the U.S. Working a program like 12-step meetings during periods of sustained sobriety has been shown to increase a person’s chances of staying sober long-term. Studies have shown that people with one year of remission from heavy use were 41% more likely to remain in remission at five years if they had received addiction treatment after their first year of sobriety. Continuously working a 12-step treatment program through the fifth year of sobriety was also shown to improve the odds of staying clean long-term.

Bridget B. Hayes at ScienceDirect talks about how more than 50% of people with substance use disorders in the United States may currently have some months or years of sobriety under their belt. Unfortunately, relapse is common even during periods of sustained sobriety lasting 12 months or more. Just continuing to show up to 12-step meetings has shown me the dire reality of my situation.

12-step recovery programs like AA, CMA, and Al-Anon undoubtedly changed my life. Because I had the privilege of going to rehab and staying in a sober living for almost two years, I was able to focus solely on getting clean and staying clean. I was required by my treatment center to attend frequent meetings, take three weekly drug tests, live with strict rules, and most importantly, get a sponsor. That’s the true meat and potatoes of 12-step work. Working with someone who’s been where you have, now has a life worth living, and is a better person all because they worked with someone else who helped them through the 12 steps. It’s a cycle of life and service and it truly does work miracles.

"The Twelve Steps of Alcoholics Anonymous." These steps are listed in a structured format, outlining a 12-step program for overcoming alcoholism. The steps include admitting powerlessness over alcohol, coming to believe in a higher power, making a moral inventory, and making amends to those harmed. The focus is on self-reflection, spirituality, and personal responsibility, with each step intended to guide individuals through the process of recovery and maintaining sobriety. The text emphasizes humility, personal growth, and seeking help from a higher power as understood by each individual.
Fig. 6 – An image of the Twelve Steps of Alcoholics Anonymous from the AA website homepage.

AA has more than 4,956 groups with 84,558 members in the U.S. and 120,300 groups with 2,087,840 members worldwide.

As effective as 12-step programs can be, more needs to be done for those who have fallen into the abyss of methamphetamine addiction. Whether it be through research or just attention and care, there has to be a solution, there has to be a way forward, there has to be a way out. In just the second chapter of Undoing Drugs by Maia Szalavitz, she talks about our current failing addiction treatment system. She also talks about a possible light at the end of the tunnel.

Sparking a discussion

Dr. Cara Poland, an associate professor at the MSU College of Human Medicine, says, “It’s no longer an opioid epidemic, this is an addiction crisis,” in an article for The New York Times.

Overall drug use in the United States is consistently on the rise. Almost 32 million people were actively using drugs as of 2021, with prescription stimulants and methamphetamines being the most popular drugs of choice.

Fig. 7 – A bar chart showing the numbers for the most popular drugs among drug users in 2021, by drug category.

All this, and yet there has been astonishingly little discussion about the meth epidemic in the U.S. We see a bit of awareness here and there with memoirs like Beautiful Boy by David Sheff and Undoing Drugs, and a few TV series, like Shameless and Euphoria, spotlight addiction in a real way, without leaning on the harmful stigmas. These stories are incredibly visceral and real. David Sheff writes at length about the difficulty of limited research and of finding true and reliable statistics, as well as about the emotional turmoil and the damage that meth does to a person’s life and the lives of their loved ones. Maia Szalavitz is considered the first author to really delve into the topic of harm reduction in a narrative format.

Still, it seems that most of the country, those not already touched by addiction through a family member, a loved one, or themselves, has gone quiet: out of sight, out of mind on this crisis. The fundamentally wrong mindsets, the stigmas, and the idea, as Szalavitz puts it, that people addicted to drugs deemed illegal become “a life unworthy of life,” need to change.

Categories: Data Journalism

‘The Data Diaries’: Making Interactive Data Visualizations about World Banknotes https://nightingaledvs.com/interactive-data-diaries-banknotes/ Fri, 03 May 2024 16:00:00 +0000 https://dvsnightingstg.wpenginepowered.com/?p=20510 The process behind building an interactive data visualization story about world banknotes from concept to finished visual.

April marked the second anniversary of publishing the interactive data visualization project that changed my career trajectory. To commemorate the occasion (and launch “The Data Diaries” series), I wanted to share the story behind the visual essay “Who’s in Your Wallet?” and how it took me from being a podcast producer to working full-time as a data visualization developer.

From observation to pitch

It was July 2021 and I had just heard of The Pudding, a digital publication that publishes data-driven visual essays on everything from Air Jordans to the world population.

The same month I found The Pudding, my home country of Peru announced some unexpected news: The country was changing its banknotes—and would now feature more women than men on its currency. Living in the U.S., where no women are represented on legal tender, I was curious how common—or rare—it was for women to be featured on money. I shared my initial curiosity with Eric Hausken, who would become my co-author, and we came up with more questions to explore. What does someone need to do to be on a banknote? Is there a common professional background or shared characteristic among them? Does the bill’s value correlate with the person’s historical importance?

While we found several sources of information about the people on banknotes, there was no up-to-date comprehensive dataset. We began gathering data from dozens of central banks to create a dataset that included 200+ banknotes in circulation (which was later featured in the “Data is Plural” newsletter).

GIF showing screenshots of central bank websites from Australia, Canada, Chile, Czechia and Indonesia

Meanwhile, we looked for articles on worldwide banknote design to see what was out there. We found three articles that served as reference points: one by Vox Media, another by National Geographic, and a third by money.co.uk. None included interactive elements, relying mostly on text and static visuals to tell their stories. This showed us that making an interactive visual essay could be appealing and provide something new for audiences.

Could this be a pitch for The Pudding? Turns out, their team had also been thinking about banknotes so they were happy to accept our idea.

From pitch to interactive essay

The process lasted around eight months, mostly because this was a side project to our full-time jobs. That was good though. This extended timeline encouraged us to verify the data and make multiple iterations.

During our data collection process, we also created static data visualizations that could serve as references for interactive ones. One of the most challenging was visualizing the time between the character’s death and their appearance on a banknote. The R graphic had two vertical axes, one showing the death date and the other showing the banknote issue date. It’s not evident in the screenshot because it only includes a few characters, but when we added all the characters, it looked like a spider web. That’s when interactive design came in handy. We used scrolling triggers so that different parts of the data would zoom in when a user scrolled through. In that way, all those messy lines could finally convey meaning.

Line chart showing the time between the death of a person and the year they appeared on a banknote. This was a visual design example so the data represented is not relevant.

Initially, we planned to focus on the data analysis and writing. However, following unplanned circumstances, I ended up designing it, too. While I had experience with graphic design and no-code web development, I had never used Figma or collaborated with a developer. I took the Google UX Design certificate to learn the basics of building low- and high-fidelity prototypes while making them for this project.

From concept to visual identity

A fun aspect of Pudding essays is that they all look aesthetically different. Each has a unique visual identity that supports the topic. That’s also a challenge, because the designer has to create a design system from scratch. Our essay had an educational purpose, so with that in mind, we opted for fonts and colors that made the project look academic without looking stiff.

During the design phase, we learned there were legal constraints to having pictures of banknotes on our site. Some countries mandate images can only be reproduced at a reduced resolution and size while others require a watermark saying “sample” or “void.” These rules limited our banknote image use and pushed us in a new direction with our visual elements. We opted for headshots of banknote characters from Wikimedia. Some were in color and others weren’t, so we turned them all black-and-white and added a texture filter to make them look like banknote illustrations. After defining all the design guidelines, I created prototypes using Figma. As soon as our developer Jeff MacInnes finished building the first version, we shared it with prospective users and the Pudding editorial team to get feedback.

User testing highlighted the importance of having a specific color for men and women that would remain the same throughout the story. In that way, the reader could identify at first glance how many men and women were represented in the different graphs. The feedback also helped us rethink some elements that weren’t as user-friendly.

Initial design idea: Adding a sidebar character (Frida Kahlo) who would guide the user throughout the reading, with bubble texts sharing interaction instructions and additional relevant fun facts. 

Feedback: The character took away virtual real estate that could be used to enlarge the graphs. 

Final design decision: Add the interaction instructions below each graph title.

A screenshot of a chart title that says "Odds are the person on the banknote is a writer, not a politician"

Final thoughts

After several months of work and multiple versions, we published the essay in April 2022.  The project got overwhelming support: it was featured on Morning Brew and the Global Investigative Journalism Network and was long-listed in the Information is Beautiful Awards.

Now that I have more coding and design knowledge, there are many changes I would consider. I would use a scraper to get the data from Wikipedia’s list of people on banknotes. Even if it doesn’t include all the categories we had in our dataset, it would make it easier to do regular updates. I would also add more visualizations, including a map to show the geographic distribution of the trends we found.
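A scraper along those lines is simple to sketch with Python’s standard library. The snippet below is an illustration, not the author’s actual pipeline: the table structure, column names, and sample rows are assumptions standing in for what Wikipedia’s list-of-people-on-banknotes page serves, and a real run would fetch the live HTML (with `urllib.request` or similar) rather than use an inline string.

```python
from html.parser import HTMLParser

class BanknoteTableParser(HTMLParser):
    """Collects the text of each <td> cell, grouped by table row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_td = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []          # start a fresh row
        elif tag == "td":
            self._in_td = True      # only collect data cells, not <th> headers

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)

    def handle_data(self, data):
        if self._in_td and data.strip():
            self._row.append(data.strip())

# Stand-in for the fetched page; structure and values are invented for illustration.
sample_html = """
<table class="wikitable">
  <tr><th>Country</th><th>Person</th><th>Denomination</th></tr>
  <tr><td>Peru</td><td>Chabuca Granda</td><td>10 soles</td></tr>
  <tr><td>Peru</td><td>Maria Rostworowski</td><td>20 soles</td></tr>
</table>
"""

parser = BanknoteTableParser()
parser.feed(sample_html)
records = [dict(zip(["country", "person", "denomination"], row)) for row in parser.rows]
```

Because the header row uses `<th>` cells, it is skipped automatically, leaving one dictionary per banknote that can be diffed against the previous scrape to do the regular updates mentioned above.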

By the time we finished the project, I realized data visualization was a viable career path so I pivoted from podcasting to data. I was drawn to interactive data visualization because of its power to engage readers and showcase complex datasets in a digestible way.

For the next six months, I shifted all my energy toward improving my data analysis and design skills while learning front-end development (shout out to my local library for free Coursera access). Today, I have the pleasure of coding graphics for a living and using data and interactive storytelling to elevate local journalism across the U.S.

Have you made an interactive data project you’re proud of? Share your “Data Diary” with Nightingale! Read our submission guidelines here

Categories: Community

From Space to Story in Data Journalism https://nightingaledvs.com/from-space-to-story-in-data-journalism/ Fri, 22 Mar 2024 02:08:10 +0000 https://dvsnightingstg.wpenginepowered.com/?p=20466 The launch of Ikonos was one of a handful of developments that allowed newsrooms to expand from reporting on rocket launches and satellite hardware, to using remote sensing data as an essential tool to help tell stories.

“Two weeks ago today, a satellite whirled above Washington on its way around the earth and shot photographs from 400 miles up that could change the way some people do business.”
— The Washington Post, October 14, 1999

Almost 25 years ago, The Washington Post reported on the first picture delivered by the brand-new Ikonos satellite. It was the first commercial imaging satellite capable of acquiring data that rivaled the resolution of spy satellites. Curiously, the reporter, D’Vera Cohn, failed to mention that one of the industries that would be changed by satellite imagery would be the news business itself.

The launch of Ikonos was one of a handful of developments that allowed newsrooms to expand from reporting on rocket launches and satellite hardware, to using remote sensing data as an essential tool to help tell stories. A wide variety of satellite data are now used to provide context to the news, to document events, and as a tool for investigation. 

A handful of factors combined with the advent of commercial high-resolution data to help make remote sensing a resource for journalists. In the early 2000s, data from government research satellites became widely available for no cost. This trend culminated in 2008 when the entire archive of Landsat data, which once cost thousands of dollars per image, was released for free. At the same time, advances in computers enabled rapid processing and storage of large datasets. Google’s Earth Engine and other cloud computing services allow types of analysis, especially of time series, that once required supercomputers. Finally, an ecosystem of free and open source software has evolved to supplement the boutique commercial applications that were once required to read the (sometimes esoteric) formats used to store and distribute remote sensing data.

Instead of reporting on a scientist’s research or claims made by an intelligence agency, reporters could now tell their own stories with this data.

How journalists use data

Although the boundaries are fuzzy, I think there are several distinct approaches journalists take with satellite data:

Context – imagery that helps readers understand the larger picture or background of a story.

Documentation – imagery that shows an event, like an explosion, or the result of an event, like damage from a natural disaster.

Investigation – imagery that is used to draw a novel conclusion.

Context

Satellite imagery used to show context functions like a locator map. It grounds the reader and helps them absorb new information. Sometimes this takes the shape of a map tucked away in the corner of a layout, a base layer with other information composited on top, or even simply a visual element meant to draw the eye.

The example above, from Quartz, uses an image of central South America from a weather satellite to show the extent of smoke over the Amazon during the 2019 fire season. The picture helps orient the viewer, and demonstrates how intentionally set fires have a continent-wide impact.

Satellite imagery can serve as an excellent basemap – a background layer with more detailed or prominent information composited on top. This works especially well when the imagery is idealized or simplified, as in this map of a highway through the Brazilian Amazon published in the Washington Post. The Post’s graphics reporters lightened satellite imagery; combined that with contextual map elements, including indigenous and protected territories, water bodies, and population centers; and highlighted the route of the highway through the rainforest. This careful layering of elements created a visual hierarchy, with the most important information in the image’s foreground, and supporting information in the background. The result is a map that’s easy to read at a glance, while providing detailed information when studied carefully.

Another way to use satellite imagery as a basemap is to remove color (which can be busy and distracting) entirely. In this map of the Gaza Strip, Carl Churchill of the Wall Street Journal used a lightened copy of grayscale Landsat data as a background layer. Colored dots and squares, representing different types of water infrastructure, stand out against the muted background – but it’s still clear where they are in relation to the roads and urban areas shown in the satellite imagery.

Satellite imagery doesn’t always have to be imagery. (In fact, there isn’t really much of a distinction between satellite “imagery” and satellite “data”.) Imaging satellites measure the intensity of discrete wavelengths of light (colors) that are then used to calculate properties of the Earth’s surface, like vegetation health or cloud cover. In this case, the New York Times’ Mira Rojanasakul used satellite data from the Copernicus Programme (likely Sentinel-2) to distinguish land from water along the southern reaches of the Mississippi Delta.

Using satellite data instead of a map allowed the Times’s visualizers to show fine detail in an ever-changing landscape. The use of a palette that exhibits gradations between land and water, rather than a hard boundary, conveys that this is a region of marshes and tidal zones that may be dry one day and wet the next. This is all in support of the larger story – of the potential encroachment of saltwater from the Gulf of Mexico into the water system of New Orleans. 
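A land/water separation like this typically comes from a spectral index computed per pixel rather than from the raw colors. One standard choice is the Normalized Difference Water Index (NDWI), which contrasts the green and near-infrared bands; whether the Times used NDWI specifically is an assumption here, and the reflectance values below are invented for illustration.

```python
def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index: water reflects green light
    but absorbs near-infrared, so NDWI > 0 suggests open water."""
    return (green - nir) / (green + nir)

# Toy surface-reflectance values (0-1 scale), invented for illustration.
pixels = {
    "open water": (0.10, 0.02),
    "marsh":      (0.08, 0.07),
    "dry land":   (0.06, 0.30),
}

for name, (green, nir) in pixels.items():
    value = ndwi(green, nir)
    label = "water" if value > 0 else "land"
    print(f"{name:10s} NDWI={value:+.2f} -> {label}")
```

Note how the marsh pixel sits just above zero: mapping the continuous index value, rather than a hard water/land threshold, is what produces the gradated palette the Times used for tidal zones that are dry one day and wet the next.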

More abstract satellite data can also provide context to a story. Sea ice dictates the path of any journey through the Northwest Passage, so a map of ice extent was essential to illustrate the route chosen by the sailboat Polar Sun in the summer of 2022. The cartographers at National Geographic merged 83 days of sea ice data from NASA to give a sense of the challenging conditions the boat had to navigate through. Note the attention to detail – the rough look of the ice on the map matches the appearance of the ice mélange that floats along the edges of the Arctic ice pack. (Watch this recording of Soren Walljasper’s 2023 North American Cartographic Information Society (NACIS) talk to learn more about the making of this and other National Geographic expedition maps.)

Documentation

A further role served by satellite data in media is documentation: imagery of a specific place and time that shows something happening – an unfolding conflict, the impact of a natural disaster, construction, or change on the Earth's surface.

In many ways, the "killer app" for remote sensing is observing conflict zones. The first high-resolution images from space were collected by and for intelligence agencies, organizations that monopolized the field for decades. Satellites provide access to far-flung areas that are dangerous, inaccessible, or both. Imagery can be a dramatic view of an unfolding event, as in the plume of black smoke belching from a Saudi oil facility in the aftermath of Houthi drone strikes above (created by me). Or it can be more subtle, like the pictures showing the construction of trenches and other defenses in Russian-occupied Ukraine, below.

Satellites are an unparalleled tool for showing landscapes before and after an event, or change over time. Landsat, Sentinel-2, and other monitoring satellites have a predictable orbit, and take images from the same perspective and time of day on a fixed schedule. Combined with precise calibration, this allows comparisons over time that can show trends. The sequence of true-color Landsat images below shows the shrinking of California’s Salton Sea after years of drought. Despite being collected over the course of two decades and by two different satellites, the data can be analyzed to accurately show the position of the shoreline (and, with the right analysis, other properties like water quality or the health of the surrounding fields).

Different types of imaging satellites have different strengths and weaknesses: in general, the lower the resolution the broader and more regular the coverage. Higher resolution satellites image smaller areas less frequently. In addition, very high resolution satellites (1 meter per pixel resolution and better) must be tasked – instructed to take a picture of a particular spot on Earth at a specific time. It’s important to plan ahead if you are trying to capture an event.

Despite these complications, high-resolution data is still a useful tool for analyzing the impact of events – in particular natural disasters like hurricanes, fires, earthquakes, and landslides.

Hurricanes Eta and Iota both struck the indigenous community of Haulover, Nicaragua within a span of two weeks. The storms destroyed much of the village, slicing through the narrow barrier island the settlement was located on. These two SkySat images, with a resolution of about 80 centimeters per pixel, show the village before and after the hurricanes struck. The powerful storms opened an inlet linking the Caribbean Sea (right) to the Laguna de Wouhnta (left) directly through the tiny village. The New York Times used the pictures to illustrate the impact of the storms on the largely indigenous community.

As with maps used to provide context, there’s no reason to be limited to showing only true-color imagery for documentation. There are thousands of different data products derived from satellites, describing properties of the Earth’s surface and atmosphere from sea surface height to ozone. In 2023, for the first time in decades, a significant snowpack persisted in California’s Sierra Nevada and Klamath mountain ranges deep into the summer. The Los Angeles Times used daily snow depth data from the National Operational Hydrologic Remote Sensing Center to compare early summer 2023 to 2022, illustrating the vast difference in snow cover. (This snow depth data isn’t purely from satellite measurements – it’s a type of assimilated data that blends orbital, aircraft, and ground-level data with physics-based mathematical models to give a seamless estimate of snow depth.) 

As with many rigid categories imposed on a messy real world (species, planets, gender…), there isn't a distinct line between using satellite data for "documentation" and data for "investigation". BuzzFeed News's work tracking the growth of Uyghur detention camps in China's western Xinjiang Province is a case in point. Satellite data augmented clues discovered on Chinese web maps to uncover mass detentions.

The BuzzFeed News reporters identified the location of camps by looking for blank spots on web maps from Chinese search provider Baidu. Some of these missing tiles coincided with the locations of known camps and military bases, but many others were near seemingly innocuous industrial areas. Checking recent satellite imagery available from Google Earth, Sentinel-2, and Planet, the team discovered hundreds of newly constructed facilities that shared features with known Chinese prisons and matched the descriptions provided by detainees. The "clear and compelling" reporting won the 2021 Pulitzer Prize for International Reporting.

Investigation

To me, the most exciting use of satellite imagery in journalism is for investigative reporting – data as a research tool, used to make discoveries and draw inferences. One early and innovative example came from Reveal News in the story "Who is the Wet Prince of Bel Air? Here are the Likely Culprits". The reporters – Michael Corey and Lance Williams – used a combination of techniques to identify the largest residential users of water in Los Angeles during the California drought of the mid-2010s. (State water agencies released a list of their largest water users, but could not share names or addresses.)

A measure of vegetation health called the Normalized Difference Vegetation Index (NDVI) helped identify properties in Los Angeles with large expanses of lush greenery. The vegetation measurements were derived from National Agriculture Imagery Program (NAIP) data, a free source of high-resolution aerial and satellite imagery refreshed every few years. This was combined with estimates of soil moisture from Landsat data, which is lower resolution than NAIP but provides information in additional wavelengths. The combined datasets gave more reliable estimates of water use than either technique used alone. To reduce the uncertainty further, Reveal even looked at the proportion of grass, trees, and shrubs on each property.
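The NDVI measure mentioned above is a simple ratio of near-infrared and red reflectance: healthy vegetation reflects strongly in the near-infrared and absorbs red light. Here is a minimal sketch of that calculation with NumPy; the band values and the lushness threshold are made up for illustration, not taken from the Reveal analysis:

```python
import numpy as np

# Hypothetical red and near-infrared reflectance bands (0–1 scale),
# e.g. pixels extracted from NAIP or Landsat tiles over a neighborhood.
red = np.array([[0.10, 0.45], [0.08, 0.50]])
nir = np.array([[0.60, 0.48], [0.55, 0.52]])

# NDVI = (NIR - Red) / (NIR + Red); values near 1 indicate dense, healthy vegetation.
ndvi = (nir - red) / (nir + red)

# Flag pixels that look like well-watered greenery (threshold is illustrative).
lush = ndvi > 0.5

print(ndvi.round(2))
print(lush)
```

A real analysis would run this over full-resolution imagery clipped to parcel boundaries, then aggregate the lush-pixel area per property.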

The result? A list of locations, each annually consuming millions of gallons of water, and an investigation by the Los Angeles City Council. An example of investigative journalism having an impact on the actions of local government.

I’ve already mentioned that a primary use of satellite data is to be able to monitor inaccessible locations. A great example of this is Bellingcat’s efforts to track illicit shipping of grain from the port of Sevastopol in occupied Crimea, Ukraine. In addition to being in a war zone, ships docking in Sevastopol often turn off their Automatic Identification System (AIS) transponder – effectively hiding their location. By obscuring their movements, ships can evade sanctions on exports from Sevastopol and transport stolen grain.

As with Reveal, the Bellingcat team combined multiple types of data to track hidden activity for the story “Grain Trail: Tracking Russia’s Ghost Ships with Satellite Imagery”. They used commercial high- and very-high resolution optical PlanetScope and SkySat imagery from Planet, plus open access medium-resolution Synthetic Aperture Radar (SAR) data from Sentinel-1. The Planet imagery revealed a ship docked at the Avlita grain terminal on more than 100 days in the year following the Russian invasion of Ukraine, despite incomplete coverage and frequent cloudiness. Sentinel-1 SAR data analyzed with the Ship Detection Tool (a machine learning algorithm run on Google Earth Engine) determined there was a ship present at the terminal on more than two dozen additional days. SAR can penetrate clouds, but is not available as frequently as Planet’s optical imagery, so even the combined dataset is likely an undercount.

Bellingcat reporters augmented the satellite data with photographs of the Avlita grain terminal in Sevastopol, and the Bosporus Strait that links the Black Sea with the Mediterranean. This “ground truth” information helped the researchers identify and track the individual ships spotted in Crimea. The combined datasets reveal the larger scope of illegal grain shipments in a way that is more comprehensive than any of the techniques alone.

Like Bellingcat, the New York Times used a mix of ground-based evidence, satellite data, and machine learning to monitor illicit activity. But instead of monitoring the motions of ships through time, the Times’s staff mapped unregistered airstrips across the Brazilian Amazon. They then analyzed additional satellite data to document illegal mining that occurred near the airstrips, and tracked aircraft delivering supplies.

Another example of researchers using machine learning and satellite data to detect illegal activity is "Myanmar's Poisoned Mountains" by Global Witness. Since they're advocates, not journalists, they don't quite fit here, but I think the story of the growth of illegal rare earth mines along Myanmar's border with China is one worth reading.

One of the more creative uses of satellite data I’ve seen is an analysis of the flight of the Chinese surveillance balloon that passed over Canada and the United States in early 2023. The story started with a machine learning approach similar to those I’ve already described, which was used to locate the balloon over North America and then track it back to Hainan Island, China. But that left an outstanding question – was the path of the balloon driven solely by wind currents? Or was it being actively guided? With no known source of propulsion, the only way to steer the balloon would be to adjust its altitude until it was carried along by favorable winds.

The Times's Visual Investigations team took advantage of a quirk present in most satellite imagery – each color is collected at a slightly different time – to determine the balloon's altitude. (You may have noticed rainbow planes while browsing Google Earth or a similar satellite-imagery map. The phenomenon is similar, except there's additional spacing between each color due to an aircraft's high speed.) Essentially, by knowing the speed and altitude of the satellite, and the elapsed time between each picture, they could estimate the balloon's altitude with trigonometry. They concluded the balloon was, in fact, being guided – at least over some of its journey.
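The underlying geometry is parallax from similar triangles: between two band exposures the satellite travels a baseline B, so an object floating at altitude h appears shifted on the ground by d = B · h / (H − h), where H is the satellite's altitude. Solving for h gives h = d · H / (B + d). A simplified sketch of that arithmetic follows; every number here is an assumption for illustration, not a value from the Times's analysis:

```python
# Simplified parallax geometry. Assumed values only:
sat_altitude_m = 700_000        # typical sun-synchronous orbit altitude
sat_ground_speed_mps = 7_000    # approximate ground-track speed
band_delay_s = 0.2              # time between two color-band exposures
apparent_shift_m = 40.0         # measured shift of the balloon between bands (made up)

# Baseline the satellite covers between the two exposures.
baseline_m = sat_ground_speed_mps * band_delay_s

# h = d * H / (B + d), from similar triangles.
balloon_altitude_m = apparent_shift_m * sat_altitude_m / (baseline_m + apparent_shift_m)

print(f"Estimated altitude: {balloon_altitude_m / 1000:.1f} km")
```

With these made-up inputs the estimate lands around 19 km, in the stratospheric range where such balloons actually fly; the real analysis would also have to account for viewing angle and the balloon's own motion.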

ProPublica is well known for their deep dives into American politics, but they also report on a wide range of environmental issues, often with the help of remote sensing data. Their series on locations at risk for future Ebola outbreaks combined investigative reporting with original scientific research. The articles uncovered how the fragmentation of forests around networks of villages and towns in Equatorial Africa correlated with known outbreaks of Ebola, and identified places where the disease may next spill over from wildlife to humans.

The series combined satellite data – long-term records of changing global forest cover and settlement maps – with pattern-finding algorithms, calculations of forest fragmentation, cloud computing on Google Earth Engine, epidemiological models, consultation with scientists, and interviews with the people of Meliandou, Guinea who survived the worst Ebola outbreak in history. Their conclusion is not just a warning for at-risk communities, but also a set of recommendations to reduce the likelihood of future outbreaks.

Most of my examples have shown reporting in far-flung locales (at least from my perspective in San Francisco’s tech industry), which is one of the primary strengths of satellite data. The data journalists at texty.org.ua, however, had to deal directly with tragedy and trauma when Russia invaded Ukraine in early 2022. They responded with some of the most detailed reporting on the impact of the war I’ve seen, despite working during blackouts and while sheltering from air raids.

Texty used multiple types of data to cover the war – including high resolution commercial imagery, night lights data, and NASA fire locations. Combined, the datasets give civilians whose lives have been upended by the invasion a means to investigate and respond to the tragedy in Ukraine that has been forced upon them. The stories reflect their interests and priorities.

Approaches for using satellite data in the newsroom

The use of satellite data is rapidly increasing in journalism, a trend fueled by growing availability, higher quality, and the development of more usable analysis tools. What does it take to successfully use this data to tell stories in a newsroom, and develop innovative reporting?

Teamwork: It’s difficult for a single reporter to have the wide range of skills necessary to fully exploit the potential of satellite data. Teams with expertise in a range of fields – investigative reporting, writing, design, programming, and data analysis – are conducting the most novel and impactful data journalism.

Data literacy: Satellite data comes in many forms, suitable for a wide variety of applications. Knowing what data is available, and the strengths and weaknesses of each type, is essential for using it effectively.

Outside experts: The field of remote sensing has thrived for over 50 years. In that time scientists and technicians in government, academia, and industry have developed techniques to derive insights from data. They’re an invaluable resource for both background information and innovative new ideas.

Local knowledge: Data collected from a few hundred miles above the Earth's surface is often limited when used in isolation. It is far more reliable when combined with in-situ data, augmented by on-the-ground reporting, and (perhaps most importantly) informed by the perspective of the people who live in the areas being imaged.

Over the past ten years, satellite imagery has become an important component of data journalism. In the next ten it will likely evolve further, from a tool used primarily to illustrate stories to one that is an integral part of research and investigative reporting. I'm excited to see how reporters develop innovative uses of existing datasets, and explore new types of data.

The post From Space to Story in Data Journalism appeared first on Nightingale.

Four Reasons to be Optimistic about Data Journalism in 2024 https://nightingaledvs.com/four-reasons-to-be-optimistic-about-data-journalism-in-2024/ Thu, 04 Jan 2024 14:05:06 +0000 https://dvsnightingstg.wpenginepowered.com/?p=19439 As we say goodbye to 2023 and look forward to 2024, here are some thoughts about the current state of data journalism.

The post Four Reasons to be Optimistic about Data Journalism in 2024 appeared first on Nightingale.

According to findings from the State of Data Journalism, a 2022 survey from the European Journalism Centre, almost 40 percent of data journalists surveyed "became involved in data journalism as a result of the pandemic." In fact, one of my favorite pieces of data journalism was published in April 2020. It was a rare piece of data journalism at the time—short and simple. Published in The Washington Post, which had generously taken down its paywall for COVID-related news, it was a much-needed article because it didn't try to tackle huge data sets, or enormous numbers, or statistics that seemed outdated the second they hit the screen. Instead, the Post published an article about jobs lost on a single block in Washington, D.C. The piece wasn't flashy or interactive, but showed the city block using illustrated, photo, and text formats in such an intimate way that the financial fallout of the pandemic became not only clear to the reader, but personal as well.

Here we are now, almost four years later, and data journalism is still going strong. While there are challenges ahead, we’ve certainly learned a few things since 2020. As we say goodbye to 2023 and look forward to 2024, here are some thoughts about the current state of data journalism.

Data journalism thrives in a multimedia, longform format.

These deeply investigated pieces are often collaborative efforts that take weeks if not months to complete. Wired’s “Inside the Suspicion Machine,” recognized by the Global Investigative Journalism Network (GIJN) as one of the best data journalism projects of 2023, was created in partnership with Lighthouse Reports and lists five authors/researchers and more than a dozen total contributors. In addition to sharp writing, the piece includes a range of interactive graphics, data visualizations, flow charts, and illustrations. It also incorporates another popular format: scrollytelling.

Many award-winning data journalism pieces follow similar formats; they are longform investigative pieces with vibrant photographs, compelling videos, and colorful illustrations alongside visually engaging data representations that make complicated material easier to understand. Most deal with serious subjects, but GIJN also honored a series of data articles focusing on Taylor Swift, including Reuters's "The Unstoppable Pop of Taylor Swift." While perhaps not technically longform, the piece includes many creative and interactive data visualizations, looks at metrics such as danceability, and has enough cat photos to thrill most Swifties.

A graphic that shows how to read and interact with the charts in the Taylor Swift article.
Source: Reuters “The Unstoppable Pop of Taylor Swift

Data journalism is thriving, but what about data viz?

While the articles referenced above include not only lots of visual interest but also some compelling visual representations of data, other pieces classified as data journalism raise an interesting question: Do data visualizations need to be a major part of a data journalism article? It would seem that data viz would be an important part of any data-driven piece, but it isn’t always. 

Take, for example, ProPublica’s stunning series that examines connections between deadly pandemics and deforestation. The three-part series won a University of Florida Award for Investigative Data Journalism, but only one of the three parts includes multiple data visualizations. The main article of the series, “On the Edge,” includes smart writing, numerous photos, pullout quotes, and other engaging graphics but only two small maps that could be considered data visualizations.

Another example is a four-part investigative series from the Marshall Project that looks at abuses in New York prisons. This important and sobering series, which was a finalist for a University of Florida Award for Investigative Data Journalism, doesn’t include a lot of data viz. “In New York, Guards Who Brutalize Prisoners Rarely Get Fired,” the main article in the series, opens with an elaborate and creative data visualization meant to show the number of disciplinary cases against prison staff, the number of attempted firings, and the number of actual firings. It’s an impressive graphic (although most people may need to read the accompanying text if they want specific numbers), but it’s also the only data visualization in the entire series.

For some readers, it may not matter whether the data are in the text or in visual format, but for those who prefer lots of data viz in data journalism, there’s reason to be optimistic. New tools are making it easier to create data visualizations, and more colleges are offering data journalism degrees that include specific coursework related to data visualization. New platforms and more data-related college degrees may also offer exciting opportunities for independently published work, such as Jessica Carr’s “Why Does Vogue Hate Text,” honored by The Pudding as one of the best visual articles of 2023.

While longform pieces often steal the spotlight, shorter data journalism articles exist and deserve more recognition.

Multimedia longform pieces often look at important subjects including war and other conflicts, racism, and economic issues, but they take a long time to create, are expensive to produce, and often require multiple contributors. Plus — and more on this below — they tend to appeal to older audiences. No matter how impressive these pieces are, shorter (and quicker-turn) data journalism articles are important as well. Pew Research Center’s Fact Tank short-form collection is one such example.

Fact Tank’s articles tend to be focused, include at least one chart or graph, and range from approximately 400 to 2,000 words. Consider Pew’s rationale for this real-time platform: “In today’s fast-moving world, it is more important than ever for data to provide context for the policy issues and major news events that have become part of the national conversation. To provide background that is both reliable and timely, Fact Tank draws on Pew Research Center’s own data as well as other reputable data sources on the topics of politics, religion, science, technology, media, economics, global trends, Hispanics and social trends.”

The 2021 and 2022 State of Data Journalism surveys (administered by the European Journalism Centre) found that most data journalism articles took at least a week and sometimes up to several months to complete. However, data from these same surveys suggest that journalists may be starting to recognize the importance of shorter forms of data journalism. Data from 2023 are still being collected; however, researchers compared findings from the 2021 and 2022 surveys and noted a small shift toward "quicker-produced stories."

Data journalism has enjoyed increased popularity in recent years, but there are still some untapped opportunities for growth, particularly in the area of social media.

To understand the importance of social media and data journalism, consider where people, particularly younger demographics, find their news. According to research from McKinsey & Company, approximately 50 percent of Gen Z-ers see news on social media, and while the number of Gen Z-ers who find news via TikTok is still relatively small, that number is growing. Exact numbers and percentages vary depending on the survey and geographical scope, but a July 2023 report from Reuters includes similar findings: “…younger groups everywhere are showing a weaker connection with news brands’ own websites and apps than previous cohorts – preferring to access news via side-door routes such as social media, search, or mobile aggregators.” While some publishers are trying to find ways to direct younger audiences back to more traditional platforms, many industry experts and academics believe media outlets should stop expecting audiences to grow into traditional platforms and instead should bring stories to the platforms these audiences already engage with.  

Sumi Aggarwal, chief strategy officer at The Intercept, reminds writers that social media isn’t a death knell for journalism and suggests looking for opportunities and experimenting with new forms: “…in today’s noisy and crowded information ecosystem, we have to work to make sure the public finds our work. That means we must reach them where they are and in ways that appeal to them.” She continues, “We must accept that the beautifully written 10,000-word piece will only reach certain kinds of audiences — those most willing to sit at a desktop and take the time necessary to read it. Those are not stories that are meant for mobile or young news consumers. The audiences for those prestige pieces inherently skew older, more affluent and let’s face it, traditional white and North American readers.” Aggarwal is speaking of investigative journalism in general, but her points certainly apply to data journalism as well.

While many media outlets have strong social presences, few are dedicated solely to data journalism, and searches for data journalism on TikTok and Instagram bring up limited results. One exception is Mona Chalabi, who created the 2023 Pulitzer Prize-winning article "9 Ways to Imagine Jeff Bezos' Wealth" (best viewed here if you don't subscribe to The New York Times). Chalabi's Instagram feed has over 480,000 followers, with her most popular posts having close to 80,000 likes and over 500 comments.

Other examples might not have quite the following that Chalabi does but still point to a bright future for data journalism. The Pudding, an award-winning online publication determined to make data fun and bring us stories we didn’t know we needed, also has an active Instagram presence. Plus, the publication is dedicated to experimenting with data-driven storytelling and gambling on weird subjects—my personal favorite might be their analysis of pockets.

Final thoughts: Data journalism, like all forms of journalism, will continue to have its challenges and struggles. That said, given the creativity and depth of research found in so much data journalism today, publications and people willing to experiment with form, subject, and platform, and tools making it easier for data journalists to create visuals and publish their work, there’s a lot of room for optimism.


How a DVS Mentorship Changed My Approach to Data Journalism https://nightingaledvs.com/dvs-mentorship-data-journalism/ Wed, 01 Nov 2023 17:25:07 +0000 https://dvsnightingstg.wpenginepowered.com/?p=18976 Over the course of the mentorship, I iterated on a data visualization project that bridged my two passions: data science and journalism.

As an editor of Reed College’s student newspaper, The Reed College Quest, my days look a lot like those of student journalists across the country. I interview professors, walk and talk with concerned students between classes, and send lots of emails to college administrators — many of which go unanswered. I’ve even done my fair share of sprinting across campus to cover unfolding protests, or calling lawyers from the nonprofit Student Press Law Center to figure out if my latest scoop is even legal to publish.

But my days also include a lot of data science work that would be unfamiliar to many student journalists. In the spring of 2023, I was the first in the newsroom to discover a second tab in an Excel file accidentally released by the college — named with the file extension “Exempt Ranges – Hidden” — which became the backbone of our explanatory coverage of staff protests against Reed’s proposed changes to employee compensation. More recently, I led a series of investigative stories which brought to light a database vulnerability that had exposed the campus ID numbers of thousands of students, faculty, staff, and alumni — one that had long been known by IT but gone unfixed for months. 

I like to say that most of my best stories have been found in the developer console, not in a reporter’s notebook, and these days my first instinct when chasing a scoop is to open a new RStudio environment and start collecting data.

Yet that can be a difficult tightrope to walk. I think of myself as a writer, but my education is in computer science, and — to ask most of my peers in either discipline — the two could not be more different. For much of my life, it was difficult for me to envision a path that would allow me to explore my passion for writing and my skill with data without sacrificing one for the other. 

Nowhere has that tension been more apparent than in my work at the Quest. As a student journalist, I’ve been trained to always write in a way that’s accessible to the reader. But as a student of computer and data science, I have experience in data analysis that most of my readers simply don’t. 

For me, data visualization can be a bridge between the worlds of data science and journalism: a way to weave hard-won insight and reporting into otherwise esoteric facts and figures.

Yet I’ve known for a long time that the balance necessary for such data-driven reporting is difficult to learn, and my work in student publications can only go so far in preparing me for the rigors of an investigative data journalism career. So, in the summer of 2023, I turned to the Data Visualization Society to improve my education.

***

The DVS summer mentorship program, which matches students with experienced professionals, was my dream come true. My mentor Julia Wolfe, Americas Graphics Editor at Reuters, was an expert in exactly the kind of data-driven reporting I hope to pursue, and I will always be grateful to her for taking the time to advise a student-journalist like me.

Throughout our ten weeks together, Julia and I collaborated on a project I’d been envisioning for months. I’ve always considered myself a writer and lover of languages first and foremost — I plan to minor in Spanish and Latin American literature, and I’ve often approached my computer science coursework in Python, C, and other programming languages just as I would a foreign language class (an approach aided by the fact that, at Reed, Introductory CS carries a foreign language credit). 

I’m fascinated by the data of language: the structures and patterns we use to express complex, abstract ideas in formal writing. I wanted to find a way to map rhetoric, to turn words on a page into numbers and then back into art, to see the why and how of speech laid bare and in vivid color. 

In retrospect, I could have chosen any kind of speech to study. But I chose political speech, mostly because of the upcoming presidential election. Using a Reuters dataset provided by Julia, I began studying presidential campaign speeches in light of the 13 key issues that American voters ranked as their highest priorities in recent weeks. My early versions — built using Flourish — were clear but blunt, conveying little more than the number of times certain words — which I thematically grouped — were mentioned in each candidate's speech.

My approach was based on simple word groupings: if a candidate mentioned the words “jobs,” “companies,” or “inflation,” it counted toward a mention of the economy; “border,” “immigrants,” or “aliens,” toward immigration, and so on. I wanted to make that strategy more clear, to be more transparent in my design and give the reader more of an opportunity to see the judgment calls that went into deciding which words fell under each issue. My next version incorporated those ideas by visually grouping the words together.
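The grouping logic described above boils down to a keyword lookup per issue. A minimal sketch follows; the issue dictionary mirrors the examples in the text, and the sample speech and function name are hypothetical:

```python
import re
from collections import Counter

# Hypothetical issue groupings, mirroring the approach described above.
ISSUE_KEYWORDS = {
    "economy": {"jobs", "companies", "inflation"},
    "immigration": {"border", "immigrants", "aliens"},
}

def count_issue_mentions(speech: str) -> Counter:
    """Count how often each issue's keywords appear in a speech."""
    words = re.findall(r"[a-z']+", speech.lower())
    counts = Counter()
    for word in words:
        for issue, keywords in ISSUE_KEYWORDS.items():
            if word in keywords:
                counts[issue] += 1
    return counts

speech = "Inflation is hurting jobs. We will secure the border from illegal aliens."
result = count_issue_mentions(speech)
print(dict(result))  # {'economy': 2, 'immigration': 2}
```

The judgment calls live entirely in `ISSUE_KEYWORDS` — which is exactly why making those groupings visible to the reader, as described above, matters.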

But this prototype felt messy to me. It’s a data scientist’s chart, not a journalist’s: it makes the maximum amount of data visible, but doesn’t have the rhetorical structure necessary to make that data clear or informative to a general audience.

After several more attempts, I thought I had found the perfect prototype. Rather than try to pack more information into a single layout, I decided to parse the information into three axes: position, color, and size. That way I could encode three variables — party alignment (color), candidate ranking (position), and voter ranking (size) — without sacrificing a minimalist layout or expanding into multiple charts. Finally, I was satisfied with my work. I zipped a folder of prototypes, sent it off to Julia in preparation for our next meeting, and closed my laptop for the night.

Respondents to a recent Reuters poll identified 13 key issues that concern them, but their rankings were not always in line with those of their candidates. Issues are ranked by their emphasis in candidates' kickoff speeches for the 2024 campaign, but sized by the priority given to them by Democratic and Republican voters.

Then, disaster struck. 

As an aspiring data journalist, I consider myself a dedicated follower of The Washington Post’s Department of Data. When I idly pulled up the Post’s homepage that Monday morning, I saw something that instantly set my heart racing: a new column titled, “The Words GOP Presidential Hopefuls Use To Stand Out In a Crowded Field.”

My heart in my throat, I clicked.

And there they were. Page after page of carefully crafted bubble charts, sized by word frequency — more beautiful, more carefully crafted than my own, but in intent and structure almost identical to some of my earlier prototypes. My idea, it seemed, was not that original after all, and The Washington Post had beaten me to it.

A minimalist packed circles chart visualizing the words used by presidential candidate Donald Trump in his November 22 campaign kickoff speech. Words are represented as gray circles sized by their frequency, with some key terms like “great” and “Biden” highlighted in yellow.
The Washington Post’s Department of Data published visualizations of political rhetoric similar to ones I had considered.

Had I given up then — quit in a moment of dejection — my project would have been over. And I’ll admit, I considered it. I didn’t want to be seen as simply copying The Post’s work, and it would be hard to explain that I really had coincidentally developed a very similar piece at around the same time. 

Julia, luckily, talked me out of abandoning it. She reassured me that it isn’t that unusual for multiple journalists to pursue similar pitches around the same time, especially when it comes to significant topics like the presidential race. 

To my surprise, however, she offered significant constructive criticism of the final prototype I had become so fond of. Size, she said, was frowned upon as an axis for important information, since relative sizes can be very difficult for the human eye to compare. That essentially gutted the layout of my final chart.

At first I dug my heels in. For lack of any better reasoning, I just liked my final draft. I found it visually pleasing, and the data scientist in me liked the idea of using size, color, and position to encode information in unexpected ways. 

But then I realized something. In designing my charts to maximize the amount of information presented across every axis, I was thinking in the terms of computer science, where encoding information in an efficient and elegant way is key. But there were other kinds of efficiency and elegance to consider. Efficiency of communication, for one, and the elegance of explanations that bring clarity and understanding to the reader. To tell an effective data-driven story, I needed to put aside my personal pride and write to the reader — to prioritize efficiency of communication over efficiency of design.

“But then I realized something. In designing my charts to maximize the amount of information presented across every axis, I was thinking in the terms of computer science, where encoding information in an efficient and elegant way is key. But there were other kinds of efficiency and elegance to consider.”

I went back to the drawing board. If I wanted to design a more approachable and more accessible chart, I’d have to do it from the ground up. I asked myself what my lede (or topline finding) was, what information I wanted to communicate most clearly, and found — to my surprise — that it wasn’t the word rankings themselves but how the issues ranked across the Republican candidate, Republican voters, Democratic candidate, and Democratic voters. Republican voters, for example, placed less emphasis on environmental issues than Democratic voters did, and Democratic voters placed less emphasis on them than Joe Biden did in his speeches. I found those differences, not only between voters, but also between voters and their leading candidates, particularly compelling.

If that was my focus, then the guiding flow of the chart should be the issue, not the party or the candidate. I looked through a folder of old FiveThirtyEight charts Julia had shown me, and found one that fit the bill: an analysis of 2016 Eurovision results that used solid lines to connect single countries between the new and old ranking systems.

A FiveThirtyEight chart depicting differences in possible outcomes for the 2016 Eurovision competition under two different ranking systems. Two sets of rankings are shown side by side, with dark lines connecting each contestant to its equivalent under the other system.
Julia showed me a FiveThirtyEight chart that inspired my next design.

It was a good design, and I liked the idea of representing key issues as single lines that the eye could follow from one side of the chart to the other. I’d simply have to expand it to include four rankings — the two candidates and the two groups of voters — instead of two. That was doable. I opened my laptop and started sketching late into the night.

From there, most of the remaining challenges came from a design perspective. Julia helped me work through several aesthetic issues to craft a clearer and subtler design, which ultimately dropped the use of color. Meanwhile, I set myself the challenge of building the final chart as a full HTML document with JavaScript interactivity. It also gave me a good reason to teach myself JavaScript for web development — something I’d been meaning to pursue for a while. 

Finally the day came. My final draft was ready.

In my final visualization, both candidates and both groups of voters are given a column dedicated to the emphasis they give to the thirteen key issues. Individual issues are connected across the chart by dark lines that change as the user hovers. At right, each issue is given its own caption that appears as its issue is hovered.
My final draft made use of interactivity to emphasize only one issue at a time for the sake of clarity. I chose the environment as the default focus because I found its vast swings between groups interesting.

The final chart bore almost no resemblance to my original bar chart of associations. The rankings remained, but any sense of magnitude was gone. Instead, the JavaScript running in the background welcomed the reader into the story by highlighting a single dedicated issue. From there, the reader could hover over any issue to highlight its path between interest groups, and my annotations of context for each issue would appear in the sidebar. 
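The hover behavior reduces to a small piece of state logic. A sketch, with hypothetical class names and element selectors (not the actual production code):

```javascript
// Given all issues and the one currently hovered, decide each line's class.
function lineClasses(issues, hoveredIssue) {
  return issues.map((issue) => ({
    issue,
    className: issue === hoveredIssue ? "line--highlight" : "line--dimmed",
  }));
}

// DOM wiring (sketch): on hover, reapply classes and swap the sidebar caption.
function onHover(issues, hoveredIssue, captions) {
  for (const { issue, className } of lineClasses(issues, hoveredIssue)) {
    document.querySelector(`[data-issue="${issue}"]`).className = className;
  }
  document.querySelector("#caption").textContent = captions[hoveredIssue];
}
```

Keeping the highlight decision in a pure function like `lineClasses` makes the default-focus behavior trivial: on page load, just call it with the environment as the hovered issue.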

The design logic of this final piece had also changed significantly since I began the project. Gone were the bold colors, the axis lines, and the attempts to cram layers of significance and implication into every available axis of meaning. My final piece is, in a way, less informative than my early packed circles charts. But it is also more clear, and, as a result, the reader learns more from it.

And that, I think, is the real lesson I took from my DVS mentorship experience. For years, I felt myself caught between two worlds: unable to reconcile my abilities in computer science and my passion for writing. Now I see, for the first time, that they are not separate skills, but one. 

“For years, I felt myself caught between two worlds: unable to reconcile my abilities in computer science and my passion for writing. Now I see, for the first time, that they are not separate skills, but one.”

When George Orwell laid out his famous rules of writing — “Never use a long word where a short one will do,” “If it is possible to cut a word out, always cut it out,” etc. — he could just as easily have been describing the principles of efficient computation as the principles of good rhetoric. 

Principles of good design and good data science are not unlike the principles of good writing: all boil down to simplicity, elegance, and clarity. Only once I realized that was I able to see past the trappings of more complex charts and visualize the data in a way that would make the insight, and the story, accessible to the reader. And now, finally, I can picture a way forward: a way to be a computer scientist and a writer, and a way to tell stories with data that leverage the skills of both disciplines. That was the real value of my DVS mentorship experience, and I have a feeling it will stay with me for a long time.

The post How a DVS Mentorship Changed My Approach to Data Journalism appeared first on Nightingale.

Infographics Journalist Charles Apple on Making Effective Charts https://nightingaledvs.com/charles-apple-on-making-effective-charts/ Thu, 10 Aug 2023 13:24:48 +0000 https://dvsnightingstg.wpenginepowered.com/?p=18199 Charles Apple, the "Further Review" page editor of The Spokesman-Review of Spokane, Washington, talks about graphics reporting.

Charles Apple, the “Further Review” page editor of The Spokesman-Review of Spokane, Washington, talks about bar charts and other data visualizations in an exclusive chat with tksajeev.


What are the benefits of using bar charts in data visualization?

I consider bar charts one of the most useful tools in creating rapid-read data visualizations, or “alternative story forms” (ASFs).

That’s my job these days. I build ASFs that we’ve branded “Further Review” pages. My paper runs them four times a week, plus we distribute the pages to other papers around the U.S.

Most of my pages try to explain a lot of material. Bar charts are extremely helpful in my work — they’re fairly quick to build but they’re very VERY quick to read. Anything that speeds the reader’s comprehension of the data or of my topic is welcome.

When should bar charts be used? 

Different types of data require different types of charts.

If you’re trying to show parts or percentages of a whole, then you may want a pie chart. If you’re showing one number fluctuating over time — think fuel or food prices, or the temperature throughout a hot day — then you probably want a fever graph.

But if you’re showing and comparing quantities, then bar charts are the way to go.

In this example, we wanted to compare fuel prices heading into the American Memorial Day holiday weekend, over the past 10 years and then the year we were in: 2017. 

A bar chart about Houston area gas prices heading into the Memorial Day weekend. The chart is 2007 to 2017 and the price for 2017 is $2.39. That bar is a darker shade than the others because it's the lowest of all the years.

A bar chart did the job nicely. Note how we highlighted the current year by using a slightly darker color.

This also had the advantage of being a small graphic, which made it easy for our page designer to work it into their page.

Now, here’s a sample from sports, showing wins and losses for each month over six months of a baseball season.

A bar chart called "A season to remember: how the Astros' 101-61 record—the second-best in franchise history—was built month by month." It shows the wins and losses from April to October, with a column for win percentage and a column for games ahead in the AL West.

Notice that there are actually TWO bar charts here: One for wins and one for losses. I’ve tucked them tail-to-tail, making it clear with color and labelling which is which. I’ve found this to be an effective technique, even when I include a LOT more data — in this case, going back 130 seasons for the football team at the University of Georgia.

A very large horizontal graphic called "A History of Georgia Football," which has every season from 1892 to 2021, bars showing wins and losses, and many callout captions to point out significant events that happened in certain years.

Wins are up top. Ties (silver) and losses (black) are below. And I noted significant events in the history of Georgia’s football program across the top, with tiny little arrows pointing to the appropriate years.

Across the bottom — carefully lined up with the year-by-year bars — is data showing how Georgia did each year against its three primary opponents: Auburn, Florida and Georgia Tech.

Much more information made for a much more complicated presentation. But if you’re a Georgia football fan, this might be worth keeping to enjoy time and time again.

What are some common mistakes to avoid when creating bar charts? 

I like to tell people that bar charts are difficult to screw up. Unless you do something obviously wrong, of course, like mislabel them. Or use them for a type of data that would be better served by another type of chart.

One common mistake is when chart designers don’t use “zero” as the baseline of their chart. This can make the data seem different from what it really is — or different from what’s being described in the text.

This one from Fox News is fairly typical of deceptive charts. It shows how many people had enrolled in health insurance under “Obamacare,” compared to the goal for just four days later. 

Obamacare enrollment chart with two bars: 6 million people on March 27 and 7 million people needed for the March 31 goal. However, the y-axis has no labels, so the 6 million bar looks very small and the 7 million bar looks very big.

Looks like they have a long way to go, right? The Obamacare officials are unlikely, perhaps, to make their goal?

But take a moment and look closer. Where is “zero” on that chart? The designer didn’t show you. And Fox News was famous for slanting their news reports to make certain politicians — including Barack Obama — look bad.

What happens if we rechart that same data and use a “zero” baseline? 

A revise of the Obamacare chart where the y-axis is at zero. The 6 million now looks much closer to the 7 million.

Well! Now, the data looks entirely different, doesn’t it?

But that’s not the story Fox News wanted you to believe. So they didn’t chart it accurately for you.
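The distortion is easy to quantify: with a zero baseline, a bar's drawn height is proportional to its value; with a truncated baseline, the visual ratio between two bars balloons. (The 5.5 million cutoff below is an assumption — Fox never labelled its axis.)

```javascript
// Visual height ratio between two bars, given where the axis starts.
function visualRatio(smaller, larger, baseline = 0) {
  return (larger - baseline) / (smaller - baseline);
}

// Honest chart: the 7-million bar is about 1.17x the 6-million bar.
visualRatio(6e6, 7e6, 0); // ≈ 1.17

// Axis truncated at 5.5 million: the same data looks like a 3x gap.
visualRatio(6e6, 7e6, 5.5e6); // 3
```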

Here’s an example from a large American sports magazine, showing the number of baseball players who hit a certain number of home runs per year.

A bar chart called "Thirty Something," showing players 25 or younger to hit 30 home runs.

It’s a very nice little chart… until you begin looking at it closely. See the “2” bars, just left of center? They’re nearly the same height as the “4” bar. But isn’t “2” half of “4”? Those bars should be half the size of the “4” bar, right? What are we missing here?

What we’re “missing” is that the designer drew a chart but then decided to add photos of baseball players. In order to make it clear which player goes with which bar, the designer simply lengthened the bottoms of the bars to include photos.

Cute. But this changed the content of the chart. It no longer accurately shows us data. The designer here might have been better off simply building a list or something, with photos tossed in.

The same chart of young ball players, but this time the bottom of the bars is set to a zero line so that the two is half of four rather than looking very close to four.

Another problem with bar charts, however, is when designers get bored with “old-fashioned” bars and try to show that same data with circles or “bubbles.” Here’s an example from a large newspaper, several years ago, showing who were, at the time, the 10 highest-paid players in Major League Baseball. Notice that the sizes of those circles are very difficult to compare to one another. 

A bubble chart showing the 10 most highly paid players. All the bubbles look about the same size, but the numbers below the bubbles indicate a range of 21.9 million to 29 million.

This data would have been instantly read and understood if the designer had used simple bars. I blogged about stuff like this extensively at the time. 

In this next one — from that same newspaper — it’s easier to see the difference in sizes of some of the “bubbles.” But not all of them. Again, if this data is worth charting, it’s worth charting in a form that readers can see and understand quickly. And this isn’t it.

A chart of the deadliest drugs, where number of overdose deaths is in bubbles, ranging from Hydrocodone at 362 to Clonazepam at 119.

Now THIS next one has a slightly different problem. This also ran in a large newspaper, about a decade or so ago. You can instantly see how much, much larger that one big blue bubble is compared to the others. It makes a big point about the enormous cost of putting on the 2008 Summer Olympics.

Cost of hosting the Olympics. Beijing was $40 billion, which is the biggest bubble compared with London, $18 billion, Athens, $15 billion, Sydney, $3.8 billion and Atlanta, $1.8 billion. The size of the Atlanta bubble is tiny and the Beijing bubble is enormous.

But here’s the BIG mistake the designer made: He didn’t calculate his “bubbles” correctly. I went back and recharted that same data, building the “bubbles” properly. 

A side-by-side of the bubbles chart, with Beijing and Atlanta looking more reasonably relative in size.

You can still see that the Beijing number is larger than the others. But the difference in size isn’t nearly so impressive.
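The correct construction scales each circle's radius by the square root of its value, so that area — the quantity the eye actually compares — stays proportional. The classic mistake, which appears to be what happened here, is scaling the radius directly by the value. A sketch:

```javascript
// Size circles so that AREA, not radius, is proportional to value.
function bubbleRadius(value, maxValue, maxRadius) {
  return maxRadius * Math.sqrt(value / maxValue);
}

// With Beijing's $40 billion drawn at a 100px radius, Atlanta's $1.8 billion
// gets a ~21px radius — not the ~4.5px that naive radius-proportional
// scaling would give it.
bubbleRadius(40, 40, 100); // 100
bubbleRadius(1.8, 40, 100); // ≈ 21.2
```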

The REALLY funny thing here is that if the designer had opted for a good, old-fashioned bar chart, that Beijing number would have popped out at the reader in a much more impressive way. 

A side-by-side of the original bubble chart and a bar chart.

See? Much better! 

But there were a lot of designers out there who complained that bar charts were old-fashioned and boring. I would argue they’re easier to read, easier to understand and harder to mess up.

What are some tips for creating effective bar charts?

Sometimes, it’s not the chart itself that can make or break a presentation of data. It’s the labels on that chart. 

By now, your readers may be wondering why I build so many of my bar charts sideways. It’s because I like to include the value on each bar when I can. And those numerals typically fit better when the bars are sideways. 

Take this page, for example. It’s about the 1948 Presidential election.

A page of the "Further Review" page featuring "Dewey Defeats Truman" photo. Under the photo is an analysis of the polling results that were happening at the time, editorial cartoons and headlines.

That election was especially known because the Chicago Tribune printed a front page prematurely declaring the Republican challenger had defeated Democratic incumbent President Harry Truman. In fact, Truman won the election.

See the little charts along the left side of the page? Those are showing electoral vote totals of the previous four elections. Those three-digit numbers wouldn’t fit comfortably atop those narrow little bars had the bars been turned vertically.

The chart shows Roosevelt winning many more electoral votes than the Republican candidates in each of the 1932, 1936, 1940 and 1944 elections. The bars run horizontally.

The same applies to the second chart, further down that page.

Horizontal bars again, showing respondents who said they'd vote Republican.

But look at the third chart, which shows three poll results — suggesting Dewey might defeat Truman — and then the actual results of the election. 

Bar chart showing different polls showing Truman vs. Dewey outcomes. The bars run horizontally.

Each bar consists of three pieces — a blue segment showing the percentage of votes for Harry Truman, a red segment showing the percentage of votes for Thomas Dewey and then an “others” category in the center that I colored a neutral brown/grey.

Each adds up to 100%, so all four bars are the same total length. They’re just split up differently. We call this a “stacked bar chart.” This can be a very helpful tool… when used properly.

Now, here’s one where each midterm U.S. election is represented by two sets of data: One for our House of Representatives and one for the U.S. Senate. And then each of THOSE is represented by two bars — one for each of our major political parties. The red bars show Republicans and the blue bars show Democrats.

A graphic for the "Further Review" page that shows the split control of the House and Senate from the 56th congress in 1899 to the 117th congress in 2021.

The point is to show how control of each of our two legislative bodies has swung back-and-forth over the years. The bars show how many members of each party were in each house during the session shown. And to make sure it’s clear which party was in control, I made the bar for the party with greater numbers brighter and dimmed the other party.

You can see right away that Democrats controlled the House for many, many years in the 1950s, 1960s and 1970s. But Republicans have been the dominant party since 1995 or so.

Now, there is actually a THIRD value shown in this giant chart: That of “other” or “vacant.” Typically, though, those numbers are so tiny that they hardly show up at all. One exception: The set of bars at the extreme upper right.

Also, note this chart shows every Congress going back to 1899. That’s a lot of data! When you pack this much material into a page, it’s important to use plenty of space around your elements to keep things from becoming too cluttered. A cluttered page will frighten off your readers.

What are some examples of well-designed bar charts?

This was one I was particularly proud of: Last year, I built a page on the 110th anniversary of the sinking of the Titanic.

A full-page graphic and text of the Titanic stats, including the number of people on each lifeboat and the demographics of those lost at sea.

I had always heard that as the ship was sinking that night, many of its lifeboats were sent out not completely filled. This, of course, contributed to the large number of people who were killed when it sank.

What I didn’t know, however — until I came across it as I was doing my research — was that there was actual data out there on exactly how many lifeboats had been aboard the Titanic and exactly how many people were loaded onto each one!

This meant that I could chart this data — data I had never seen before. Nothing makes me more excited than creating a chart for data I had never seen before. Here’s a closer look at that part:

A bar chart of the 20 lifeboats -- their number identifier, their launch time and the number of passengers, ranging from 12 people to 70 people. Each lifeboat had a 65 person capacity except two that were 40 people and four that were 47 people.

The blue bars show how many people were aboard each lifeboat. The grey part of the bar shows how many people CAN fit aboard each lifeboat. So this is actually another “stacked bar chart,” right?

You can see that two lifeboats were sent away with 70 people aboard — when they were supposed to hold only 65. But some of the others were put into the water with a great number of empty seats. Lifeboat No. 7 could have held another 46 people! Lifeboat A could have held another 34! 

While we’re at it, let’s look at the smaller sets of bars at the bottom right of that page…

Bar chart of those lost at sea: 693 men crew lost vs. 192 saved; 387 third class men lost vs 75 saved; 154 second class men lost vs 14 saved; 118 first class men lost vs. 57 saved. Also, 3 women crew lost vs. 20 saved, 89 third class women lost vs. 76 saved; 13 second class women lost vs. 76 saved, 4 first class women lost vs. 140 saved. Finally, 52 third class children lost vs. 27 saved, 0 second class children lost vs. 24 saved and 1 first class child lost vs. 5 saved.

This one simply shows how many people were saved via those lifeboats vs. how many died that night. Notice that of the people who were saved, a lot of them had traveled first or second class. Third-class passengers were not taken care of very well that night.

Now, here’s one that’s fairly simple, I think. But, at first glance, APPEARS to be complicated. This page shows sales figures in the United States of various formats of recorded music.

A "Further Review" page on the compact disc. The page has several bar charts showing popularity of vinyl, 8-track tapes, cassette tapes, CDs, digital downloads and streaming over time.

There are six bar charts here. But a) each is drawn to the same exact scale, and b) each is carefully positioned so the years all line up, from chart to chart, through the length of the page. This means readers can compare numbers for each year.

The first batch of grey/black bars show the decline of vinyl record sales in the 1980s to very little by the 1990s. But the blue bars show cassette tape sales increasing throughout the 1980s, peaking in 1990 and then tapering off by the 2000s.

But it’s the red bars that I really wanted to jump out at readers: That was the topic of my story: sales of compact discs. Not only can you see them increasing from 1983 and peaking in 2000, but you can tell from the height of those bars how much greater CD sales were than either tapes or vinyl!

The purple charts at the bottom right show digital downloads and streaming. The numbers are nowhere near what CD numbers had been 20 years ago. Which tells us that record companies are not making anywhere close to the revenues they had been making.

And now for another one that’s rather complex, but I feel like our presentation made the story fairly easy to understand…

Every once in a while, the U.S. Post Office increases the cost of mailing a letter. They raised it again just a few weeks ago, in fact.

I built this page when they raised the price last year. The actual cost of a stamp is shown by a fever chart: The blue line you see running down the left side of the page. For many, many years, the price of a stamp was under 10 cents.

A "Further Review" page showing the real cost of stamps over time since the mid-1800s. It shows the actual and the adjusted prices on along horizontal bar chart.

What I wanted to do here, however, was to show both the actual price of postage — going back to 1863, which was nearly 160 years ago — and the price of that postage when adjusted for inflation. Only then would readers truly understand how much they’re paying.

So while the blue line shows actual prices, the grey/blue bars behind them show the price adjusted for each year. Yes, this meant I had to go to the web site I use to calculate inflation and come up with a number for each of the 159 years I showed here.

The result shows that the price of postage, when adjusted for inflation, really hasn’t gone up much in 136 years! Which is not something the average reader is aware of.
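The adjustment itself is one ratio per year: the nominal price multiplied by today's CPI over that year's CPI. A sketch with illustrative index values (not the actual figures used for the page):

```javascript
// Restate a historical nominal price in today's dollars using CPI.
function adjustForInflation(nominalPrice, cpiThen, cpiNow) {
  return nominalPrice * (cpiNow / cpiThen);
}

// A 3-cent stamp bought when the index stood at 25, restated at an index
// of 300, works out to about 36 cents in today's money.
adjustForInflation(0.03, 25, 300); // 0.36
```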

Note the five little sets of bars over on the bottom right. Those were meant to show this particular increase — the one that fell into place in July 2022.

Prices of various parcels, from a first-class letter less than an ounce to a first-class letter for each additional ounce, to a post card to an overseas letter.

What is the main message that you want to communicate with your bar chart?

Bar charts allow for easy comparison of quantity. So if you want to show the amount of something for this month compared to each of the past few months: That’s a bar chart. If you want to show the amount of something for your nation or your area compared to other nations or areas: That’s a bar chart.

I especially like them because they’re so easy to understand and don’t require a reader to spend an awful lot of time with them. So they’re great to support a main point I might be making in the text. Or the chart itself can be a nice sidebar to the main part of a story.

What are some design and text strategies that you use to highlight key data points?

Earlier, I showed you a bar chart that used a darker color on the current year’s data. That works well sometimes.

Bar Chart of minimum wage around the US, where each bar is a state and the values are the minimum wage.

This one does the same thing, except it also adds a black box to show how many U.S. states had the same level of minimum wage as our state — which was Texas. 

This one is actually a table — rows of numbers — in which we’ve taken the most important set of numbers and charted just those.

Table showing that football participation at Des Moines city schools compares poorly to the rest of the Central Iowa Metropolitan League. Each row is a place. The last column has a horizontal bar chart showing participation percent.

But again, note that I highlighted five of the 18 bars. Those were schools in our metro area. The point we were trying to make: Participation in playing American football at our area schools was lower than that of other large schools around the state.

A tip: Use your introductory copy at the top of the chart to make it clear what you are hoping the reader will learn from your chart. And then make sure the chart shows them the data to back up that statement.

And in this example from 2016, we were trying to show how candidates from the two major U.S. political parties were progressing in winning the nomination for president. 

Horizontal Bar charts of Delegates for republicans and democrats for the US election. The republicans needed 1,237 for nomination (and Trump had the lead with 743) while democrats needed 2,383 delegates for nomination and Hillary Clinton had 1,758 (including superdelegates).

I color-coded the charts with the colors the two parties use. And I made the leading candidates a darker color than their opponents of the same party.

But in each case, I wanted to make it clear how many delegates each candidate had earned and how much further they had to go to be nominated. So the little dotted line at the end of the chart was just as important as the bars themselves.

What is the target audience for your infographics?

I work in print, so it’s newspaper readers I’m trying to reach with my work.

For several years, my organization gave up trying to post digital versions of my pages. Which makes sense, I suppose: Especially when I use VERY large charts, building a version that can be easily read on a laptop computer as well as a smart phone can be extremely difficult.

Over the past few weeks, however, we’ve started up again. My newspaper is the home of the only paid high school internship program in the United States. They assigned one of those interns to build digital versions of my full page presentations.

We’re still relearning everything. But this digital version of one of my pages contains two bar charts. See what you think.

And this one has a collection of VERY simple bar charts, down near the end.

What is the overall tone and style of your infographics? 

With my full-page presentations, I’m generally trying to explain a complex topic but in a way that makes it easy for the reader to understand. I often work with historical anniversaries or science topics. Not just what happened, but also why it’s important to remember or to know about.

Some of these pages contain bar charts. Some contain other types of visualizations. And some are just text and pictures.

Here’s one from three years ago, about the settlers who sailed to what is now the U.S. back in 1620. We call them “the Pilgrims.” But what I didn’t really know — until I built this page — is that there is data on everyone aboard the Mayflower ship: Men, women, and children; ages and who lived and who died. 

A full-page infographic of the Pilgrims who set out for America in 1620. Icons of people show who died en route or during the first winter.

It’s stunning to see the huge percentage of them who didn’t make it through their first winter in North America. I had never seen this charted out before. I take particular joy in creating a visual out of data I’ve never seen before.

Here’s another one — also from several years ago — that shows the D-Day invasion of June 6, 1944 and includes a visualization of how many men came ashore in Normandy and how many were killed.

A full-page infographic showing the various divisions that landed on the D-Day beaches. A grid shows the share of troops vs. casualties for each division.

Some days, I use no data visualization at all. Some days, it’s just diagrams and text…

A full-page graphic showing a diagram of the Saturn V, with annotations showing its various parts. Also, diagrams of the lunar orbit rendezvous and lots of text descriptions of the Apollo 11 people and events.

… or just photos and text. This one ran after the horrific riot and attack on the U.S. Capitol on Jan. 6, 2021, listing eight previous times the Capitol had come under attack — or where various people had come under attack.

A descriptive timeline of eight previous attacks in or on the U.S. Capitol. Each event includes a date, a headline, a photo and a description.

And on occasion, I don’t mind having a little fun with a page. This bar chart shows the run times of all the Marvel superhero movies, compared to the one that was opening in theaters that day: “Avengers: Endgame.”

A bar chart of Marvel movies by length.

I also compared the run time to that of four other famously lengthy movies.

Why did we chart that? Because people were already complaining that the movie was so long, it was difficult to find time to sneak out of the theater to find a restroom!

A view of the full graphic (which includes the movie duration bar chart) and also a list of tips to avoid having to use the bathroom during a movie, including avoiding diuretics, irritants, and certain foods.

That was one of the first pages I ever did for the Spokesman-Review. 

Like I said: Sometimes, my chart is a bar chart and sometimes it’s another type of chart. Sometimes, instead of a chart, I’ll build a timeline. Or tell the story with small chunks of text — think of them as mini-stories, or story chunks.

I’m honored when someone refers to me as a data journalist. I see and admire the cutting-edge work that data journalists are doing these days.

But I’d argue I’m just an old graphics guy trying to stretch his career just a few more years by telling different stories in different ways. And I’m lucky enough to find a newspaper that will give me a home where I can do that.


The post Infographics Journalist Charles Apple on Making Effective Charts appeared first on Nightingale.

Creating a Design System to Prevent Problematic Colour Pairings
https://nightingaledvs.com/data-journalism-colour-accessibility/
Thu, 13 Jul 2023 17:39:51 +0000

The data design system at Economist Impact needed to ensure charts were clear, insightful, and accessible—which led to a new colour tool.

“Even for print, I have questions about the subtlety in colour difference,” a member of our data journalism team admitted when the subject of accessibility was raised. “It’s like a Farrow & Ball swatch, which is nice in a Georgian home, but sometimes a little too subtle.” 

Data visualisation is something we at Economist Impact take very seriously. Our group has 150 years of experience publishing charts, yet we continuously question how we talk about and show data. We learn and adapt by critiquing work from ourselves and others as the data visualisation field continues to mature. 

Launched towards the end of 2021, Economist Impact is the latest addition to a family of businesses which form The Economist Group. Working with governments, NGOs, and international institutions, we use data visualisation to present our policy research and insights, while helping to drive change with unique digital storytelling experiences. The need for a dedicated data design system was obvious from the beginning. Our brand design toolkit was never intended to handle the complexities that data bring and that information designers face day in, day out. 

Everyone, it seemed, had their own take on what our new data design system should be and what it should do. Some wanted above all to address inconsistency; others wanted guidance encouraging more varied chart selections, so that our innovative approach to research could be better reflected in how we show it. But above all else, we heard that we needed to ensure our charts are as clear, insightful, and accessible as possible.

The first requirement of our charts is that they should be readily understandable. 

Page one of The Economist Style Guide offers the following advice: “Clarity of writing usually follows clarity of thought. So think what you want to say, then say it as simply as possible.” As an extension of our writing and how we deliver insight, our new data design system was shaped to help our charts follow that same path. 

We offer our designers a few tips to ensure their charts are as clear as possible. First, we encourage them to try writing the chart headline before plotting the data. Boiling down the primary message helps bring focus and clarity. Next, we suggest they consider the types of data relationships that best demonstrate the headline. The UK’s Office for National Statistics (ONS) lists eight common relationships that charts usually display: magnitude, time series, ranking, part-to-whole, deviation, distribution, correlation, and spatial. We’ve included simplified examples of each of those in our data design system to encourage our designers to consider all possibilities. Additionally, we ask our designers to think about what data can afford to be discarded. Showing too much can dilute the message, but removing too much may obscure the underlying context—we suggest using callouts or visual hierarchy to preserve detail without burdening the reader. 

All of this is fairly standard advice that should not surprise anyone familiar with data visualisation, yet its inclusion in our design system is vital to ensure we never lose sight of what we want to achieve with our charts. 

If accessibility standards cannot be met, we will lose business. 

Beyond that general advice, one of the main goals when producing our new data design system was to address concerns of accessibility. Few would dispute the importance of design inclusivity, particularly for an organisation like The Economist Group with a wide reach. Yet so often, inclusivity is overlooked when selecting chart colours. “This looks good; others must think the same” is a dangerously irresponsible attitude to take. At its core, data visualisation relies on encoding and decoding information in a way that our brains can effectively interpret. If the decoding fails, so too does the interpretation. Whether through ignorance or other failings, not taking account of a reader’s ability to process colour risks a communication breakdown. 

But accessibility standards and guidance are murky when it comes to data visualisation. Consider colour vision deficiency, a.k.a. colour blindness. There is not only one form of colour blindness, nor is there a one-size-fits-all solution. Also, this is not a rare disorder—some 8% of males have difficulty distinguishing red from green. When cultural conventions (in the West at least) state that desirable data be shown in green and less desirable in red, we have an accessibility problem. Even we at Economist Impact have succumbed to such crimes of colour in the past. 

Developing a data design system that meets accessibility standards also requires planning for the unexpected. Data are frequently messy and unpredictable. Designers demand flexibility and options, but more options mean more chances for things to go wrong. We advise designers to use as few colours as possible, but sometimes an expanded palette is needed. 

So, to help us rapidly test colour palette ideas against accessibility standards, we developed and prototyped a colour accessibility tool. “Colour accessibility” is a bit of a mouthful, hence this tool became affectionately known as “Cassy.” 

Cassy began as a simple set of grouped objects in Adobe Illustrator, where a single colour change immediately presented possible colour pairings to help us spot where brightness or hue were too similar. Filters from the colour blindness simulator Color Oracle helped us make further refinements to prevent problematic pairings. Cassy became an invaluable asset, not only for us to prototype and refine our own palettes before testing them with real charts, but also to assess and learn from palettes made by others who were clearly responding to the same challenge. 

A two-column chart showing colour pairings on the left side and the same colour pairings with a deuteranopia filter. The effect is that bright colours become muted, grey, and sometimes indecipherable from each other. There are five such examples, from PwC; The Guardian; the BBC; IBM’s Carbon Design System; and Economist Impact’s own system.

Testing colour combinations from palettes we found during our research. Left shows normal vision; right shows Color Oracle’s deuteranopia filter. From top to bottom: PwC; The Guardian; BBC; Carbon Design System by IBM; and the system we developed at Economist Impact. 

As we realised her value, Cassy 2.0 took a leap into Google Sheets. Using colour hex codes, she was able to quantitatively measure relative luminance and contrast ratios for every pair of colours across a set. Web Content Accessibility Guidelines (WCAG) 2.1 standards call for a contrast ratio of at least 3:1 for “meaningful graphics.” Achieving this across every pairing is mathematically impossible for any palette containing more than a small handful of colours, let alone one as comprehensive as a data design system requires. This is why we needed to provide additional guidance for cases when colours in charts touch directly. Still, this knowledge helped us to refine our recommended categorical palette and the advised sequencing to maximise its effectiveness.

Cassy 3.0 currently lives on the web as a tool that combines the best bits from her previous guises. While many contrast tools only measure the relative luminance between two colours at a time, Cassy checks every possible combination of 14 colours simultaneously, for a total of 91 pairings. Colours can be selected and edited using the bars at the top and left of the screen, while the central grid shows a visual representation of colour pairings. The contrast ratio is displayed in each instance, appearing in either a white box if it meets or exceeds the user-defined contrast threshold (higher values are better) or a black box if it falls below. 
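Cassy’s internals aren’t published, but the check she performs follows directly from the WCAG 2.1 definitions mentioned above. A minimal sketch in Python — with a hypothetical five-colour palette standing in for a real design-system set — computes relative luminance from each hex code, derives the contrast ratio for every pairing, and flags any pair falling below the 3:1 graphics threshold:

```python
from itertools import combinations

def srgb_to_linear(c):
    # WCAG 2.1 linearisation of one sRGB channel in [0, 1]
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(hex_colour):
    r, g, b = (int(hex_colour.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.2126 * srgb_to_linear(r) + 0.7152 * srgb_to_linear(g) + 0.0722 * srgb_to_linear(b)

def contrast_ratio(c1, c2):
    lighter, darker = sorted((relative_luminance(c1), relative_luminance(c2)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

# Hypothetical palette for illustration; a 14-colour set yields C(14, 2) = 91 pairings
palette = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#e7e7e7"]
for a, b in combinations(palette, 2):
    ratio = contrast_ratio(a, b)
    if ratio < 3.0:
        print(f"{a} vs {b} fails: {ratio:.2f}:1")
```

This also makes the 91-pairing figure concrete: it is simply the number of unordered pairs drawn from 14 colours.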

A screen shot of Cassy, a tool developed to measure contrast ratios. The screen shot shows a grid of color pairings. In the corner of each pairing, there is a number indicating the ratio. An RGB slider tool allows the user to adjust the red, green and blue in each color.

Cassy is an experimental tool developed by Economist Impact to measure contrast ratios between colour pairs. 

Although she remains experimental and was not originally conceived as a public tool, anyone wanting to explore palettes can find Cassy online. For us, as useful as she is, the true value of Cassy came from our learnings about colour as we saw her develop. Accessibility isn’t something which can be measured through a single lens; contrast ratios are just one part of a more complex picture. 

To learn more, we also recommend the wonderful Viz Palette, a tool by Susie Lu and Elijah Meeks that flags conflicts for different types of colour blindness. We could not resist testing a variety of palettes, both our own and from others, with the BBC’s system proving a particularly tough example to top. 

Lastly, a word on sequential and diverging scales. Thanks to chroma.js, these are a walk in the park. In our data design system we’ve specified colours at every stop for palettes featuring different numbers of classes. We have also clearly indicated whether text labels should be black or white in every situation to ensure compliance.
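The black-or-white label decision can itself be automated rather than judged by eye. A small sketch — using the same WCAG 2.1 luminance formula, not any published Economist Impact code — picks whichever of black or white text contrasts more with a given background colour:

```python
def srgb_to_linear(c):
    # WCAG 2.1 linearisation of one sRGB channel in [0, 1]
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(hex_colour):
    r, g, b = (int(hex_colour.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.2126 * srgb_to_linear(r) + 0.7152 * srgb_to_linear(g) + 0.0722 * srgb_to_linear(b)

def label_colour(background_hex):
    # Compare the contrast of white text (luminance 1.0) and black text (0.0)
    # against the background, and return the better of the two
    lum = relative_luminance(background_hex)
    white_contrast = (1.0 + 0.05) / (lum + 0.05)
    black_contrast = (lum + 0.05) / (0.0 + 0.05)
    return "#ffffff" if white_contrast >= black_contrast else "#000000"
```

For example, a dark navy background would get white labels, while a pale yellow would get black ones.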

Rather than simply being a set of rules that constrain how we present our data, we hope our data design system encourages new ways of thinking within our business. For some members of our team, data is still secondary to the written word, but we are keen to pivot and allow visualisations to lead the conversation. And we are still only just beginning. Today, our data design system is a 100-page Google Slides deck. As it attracts more eyes and weaves its way deeper into our workflow, we see more opportunities for both it and ourselves to evolve. 

A real-life example of how the color tool can be used. In this example, the image shows, on the left, a sample color spectrum of pinks, reds and purples.  On the right, there's a heat map of the U.S. showing prison populations. The lighter pinks are in the northeast states. The darker reds and purples are in the southern states.

Providing designers with specific colours for a variety of situations helps to streamline the workflow, maintain consistency, and boost accessibility.

The data design system allows certain colours to pop out, showing the most important parts of the data. For instance, in a chart about music revenues by format, the cassette, CD, MP3, and streaming trendlines are in blue hues while vinyl is in red. The chart shows that vinyl was popular in the ’70s and ’80s but started making a comeback in the late 2010s. The red against the blue helps the reader see this trend clearly.

Our data design system offers guidance to help highlight the most important aspects of the data. 

This article originally appeared in Issue 3 of Nightingale magazine. Get your copy here.

The post Creating a Design System to Prevent Problematic Colour Pairings appeared first on Nightingale.
