Data Literacy Archives | Nightingale, The Journal of the Data Visualization Society

Exploring Data Detective Practices as a Class Activity
https://nightingaledvs.com/exploring-data-detective-practices-as-a-class-activity/ | Fri, 24 Oct 2025

The post Exploring Data Detective Practices as a Class Activity appeared first on Nightingale.

Figure 1. Example journey of Data Detective: beginning with defining a critical problem or question, identifying gaps (“What’s missing in the picture?”), searching for data, and confronting barriers—missing, partial, or deliberately obscured datasets. Each step provided unique insights into the relationship between data, society, and power dynamics. (Illustration by © Zezhong Wang & Ruishan Wu)

We reflect on our experiences in a recent computer science graduate class about data feminism, during which we explored the idea of being data detectives. In this report, we explain what we mean by Data Detective: an active approach through which we, as individuals, can engage with the underlying questions posed by D’Ignazio and Klein, “Data science by whom? Data science for whom? Data science with whose interests in mind?” By connecting individually through personal reflection, data literacy, and critical engagement, our goal is to inform and inspire those who are interested in integrating similar methods into their classes.

As our society continues to evolve, more and more of the information we need is stored as data, and many of these repositories are growing into what we refer to as Big Data. In the process, data becomes more challenging and less accessible to us as individuals. As visualization researchers, we work on creating visualizations as at least part of the solution to this problem. However, much of our data is still not visualized, and even when it is, individuals often find it challenging to understand. How do we cope with this? How do we teach our students to cope with this continually expanding problem?

In our data feminism class, we introduced concepts such as visual variables, physicalizations, and assumptions about knowledge development (e.g., positivism and interpretivism), along with reflection and discussion on reading the book Data Feminism. We then explored developing an active practice through which we would document our investigations of both qualitative and quantitative data under various themes. We now call this active practice being a Data Detective.

Our concept of Data Detective is modeled on detective work in a more general sense, where a person uses coherent, time-based record-keeping of their activities to gain a better understanding of something they initially do not know but want to understand. Thus, to act as a Data Detective is to discover and conduct the purposeful, documented, and reflective actions needed to gain access to the desired data. This detective work ideally results in access to the desired data, an understanding of that data, and an understanding of the detective process involved.

The term Data Detective appears in various contexts, making it important to clarify our specific approach. Ours is unlike children’s books that suggest counting objects (like red cars versus white cars), Harford’s statistical literacy guide with its ten rules for making sense of statistics, or visualization workshops that give children a gamified sense of accomplishment. Our approach also differs from Inselberg’s multidimensional data detective work, which focuses on analyzing existing visualizations, and from data activism approaches that emphasize community engagement.

We visualized our investigative approaches as journeys: beginning by defining a critical problem or question, identifying gaps (“What’s missing in the picture?”), searching for data, and confronting barriers—missing, partial, or deliberately obscured datasets. Each step provided unique insights into the relationships between data, society, and power dynamics.

Examples

Throughout the semester, students undertook diverse projects with strong societal relevance, including topics such as gender bias in politics, barriers faced by women in entrepreneurship, the functions and ideology of pockets constrained by historical gender roles, and gender representation within STEM academia. 

One student examining women’s representation in political institutions vividly illustrated the practical challenges of data detective work. Initial exploration quickly highlighted systemic data gaps as key datasets were fragmented or unavailable. The student navigated through a frustrating landscape marked by opaque official sources, partial records, and silences. Despite challenges, this data detective journey offered significant emotional and intellectual rewards. The student discovered patterns of marginalization, for instance, women are frequently relegated to peripheral roles rather than core decision-making positions. Each painstakingly gathered dataset provided clarity about structural inequalities. Ultimately, the effort became a tangible act of resistance against invisibility and marginalization.

Figure 2. Data Detective journey created by © Ruishan Wu.

Another student explored the challenges women encounter in achieving tenure in Canadian academia. Initially optimistic, the student encountered considerable barriers, including incomplete or outdated datasets and inconsistent categorization across institutions. Interviews became essential to fill these gaps, highlighting how data detective work can require alternative methods beyond computational data collection. The journey revealed systemic biases: women are disproportionately assigned tasks that correlate with lower job satisfaction and hindered career progression.

Figure 3. Data Detective journey created by © Haidan Liu.

Working with both big data & personal data

As we moved through the process, we found ourselves blending two approaches to data visualization that are often kept separate: working with big data and working with personal data. Big data showed up in the external datasets we chose to investigate, such as government records, institutional statistics, or public health databases. These are the kinds of large-scale, structured data commonly associated with the term big data.

On the other hand, personal data and visualization came into play as we reflected on our own experiences navigating these data landscapes. By documenting our paths through note-taking, diagramming, and visualizing our steps, we deepened our understanding of the datasets themselves and uncovered what was missing, what was hard to access, and where our questions should lead next.  

Central to our pedagogy was encouraging students to critically reflect on their data practices. We structured reflective exercises to surface the implicit power dynamics in data collection and usage. Students were prompted regularly to question: Whose data are we using? Who collected it, and for whose benefit? Who controls access, and how does that affect analysis?

This reflexivity deepened our critical engagement, enabling us to overcome technical challenges and interpret the implications of our findings.

We suggest one possible pathway to actively take on the role of being a Data Detective:

  • Initially clarify what one is looking for—this is before one has the data.
  • Develop a timeline starting from the current moment, which will track the process by which one gains or loses access to the data.
  • Choose a currently promising direction to find more information (could be: ask a person, search on the web, go to an institution, etc.)
  • Collect and reflect on the information gathered, filling in one’s timeline with data, facts, and responses, including emotional responses and levels of frustration.
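As a minimal sketch of the pathway above, the timeline could be kept as structured journal entries. The class did not prescribe any tooling, so the names, fields, and sample entries here are purely illustrative:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class DetectiveEntry:
    """One step in a Data Detective timeline."""
    when: date
    action: str        # e.g., "emailed registrar", "searched open-data portal"
    outcome: str       # data gained, lost, or blocked
    feelings: str = "" # emotional and frustration-level responses

@dataclass
class DetectiveJourney:
    question: str                  # what one is looking for, defined up front
    entries: list = field(default_factory=list)

    def log(self, when, action, outcome, feelings=""):
        """Fill in the timeline with data, facts, and responses."""
        self.entries.append(DetectiveEntry(when, action, outcome, feelings))

# Hypothetical journey, loosely inspired by the student examples
journey = DetectiveJourney("How are women represented in core decision-making roles?")
journey.log(date(2024, 9, 12), "searched government open-data portal",
            "found only aggregate counts; role-level data missing", "frustrated")
journey.log(date(2024, 9, 19), "emailed institutional records office",
            "partial spreadsheet received", "encouraged")
```

Keeping the record structured like this makes the later reflection step easier: the entries can be sorted, filtered, or visualized as a journey diagram.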

Actively conducting Data Detective projects in our class, where we used personal visualization of our detective process, taught us about both institutional and personal data while revealing many factors about our society. Each data point gathered and each visualization created represents a small act of making the invisible visible, contributing to more equitable and inclusive understandings of our complex social world.

Acknowledgement

We thank our colleagues and reviewers for their thoughtful comments. This research was funded in part by NFRFR-2022-00570 (A Co-Design Exploration), NSERC Discovery Grant: Interactive Visualization RGPIN-2019-07192, and Canada Research Chair in Data Visualization CRC-2019-00368.

Categories: Data Literacy

The “Dashboard” is Broken
https://nightingaledvs.com/the-dashboard-is-broken/ | Wed, 16 Apr 2025

The post The “Dashboard” is Broken appeared first on Nightingale.

The value of dashboards has eroded. When executives hear the word “dashboard” today, they envision standard charts in BI platforms—obligatory elements for meetings rather than catalysts for insight.

Business leaders once championed dashboards as windows into organizational performance, but dashboards became too familiar and too technical, and their value diminished. As evidence, compare those in “business intelligence” roles with the business leaders they serve: there is a massive gap in seniority, influence, and wages.

How did this happen? Let’s discuss three ideas:

  • Dashboard rot devalued BI
  • Data people were never trained in design or communication
  • D3.js is complicated

Dashboard rot devalued BI

Business leaders scrambled to use data to inform the C-suite, and in the process, multiple layers of the organization got their own dashboards. When BI software became a premium license, it was only a matter of time before enterprises began counting which dashboards were used and which had never been opened. The overwhelming under-utilization of dashboards across organizations led to the term “dashboard rot,” which reflects a fundamental misunderstanding of where the value was in the first place. It’s like counting all Word documents in an organization versus what is published. The value has always been in the insight, not in the number of documents.
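The counting exercise described above is trivially easy to automate, which is part of why it became the default metric. A sketch with made-up dashboard names and a hypothetical view log (no real BI platform API is assumed here):

```python
from collections import Counter

# Hypothetical view log: one dashboard id per recorded view
view_log = ["sales-kpi", "sales-kpi", "exec-summary", "sales-kpi"]
all_dashboards = ["sales-kpi", "exec-summary", "hr-attrition", "ops-backlog"]

views = Counter(view_log)
unused = [d for d in all_dashboards if views[d] == 0]
rot_rate = len(unused) / len(all_dashboards)

print(unused)    # dashboards never opened
print(rot_rate)  # the "dashboard rot" figure enterprises report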

The way BI software was monetized ended up devaluing its own importance. Dashboards became an IT cost center in many regards instead of a strategic advantage. They became a burden, and in many organizations, “reporting” was seen as boring and a potential waste of time.

Thinking of the value of BI differently: if a dashboard can make a $1M decision easier, is it worth $1M? If, over its lifetime, it supports a $5B company in running its business daily, does that make it worth $1M or more? Yet organizations don’t think of investing in dashboards the way they invest in software: software is a strategic advantage, but dashboards are just the cost of doing business.

Data people were never trained in design or communications

Maybe part of the reason dashboards instill a certain amount of hesitation is that most are not well designed. Many people working in analytics come from data science, data engineering, or data analysis backgrounds, and those fields lack significant design or communications training. While it is impossible to say all dashboards are badly designed, I’m certain that most people who create dashboards do not consider themselves “good designers.”

There’s a big difference between the kind of high-level graphic design we see in advertising or in consumer apps and the kind of important tweaks that could easily elevate most dashboards. In fact, most dashboards can probably get a significant lift by adjusting the language used in titles and labels alone.

The success of data literacy programs shows the importance of training people in more than just foundational data visualization practices. This shift—if we can make it one—from data towards communication might see value returned to business intelligence, ushering in a new generation of thought partnership between analytics professionals and organizational leadership.

D3 is complicated

BI software exists because custom-coding charts was difficult. When D3.js was introduced, an entirely new way to draw shapes in the browser opened up opportunities to visualize data, from simple charts to multidimensional interactive tools. But developing charts with D3.js was far from straightforward, which pushed the work into the domain of software development.

While it is not the fault of D3 that dashboards have lost their zest, the complexity of doing this work opened the door for faster (and therefore cheaper) tools to take its place. Many frameworks for creating interactive business charts sprang up, each with its own tradeoffs and its own flavor of front-end, and in the process, the software design was assigned to the UX designer. I’m a former UX designer, and I can tell you definitively that data visualization and data communication simply do not exist in user experience design—despite the fact that almost all software design is a visualization of data.


Maybe it’s time we drop the idea of dashboards and focus instead on data communication? By adopting this shift, we might just recontextualize the power of data.

There’s a lot here to discuss, so please let me know what you think!

This article originally appeared at: https://www.linkedin.com/pulse/word-dashboard-broken-jason-forrest-agency-aco1e

Step 8 in the Data Exploration Journey: Build
https://nightingaledvs.com/step-8-in-the-data-exploration-journey-build/ | Thu, 29 Feb 2024

The post Step 8 in the Data Exploration Journey: Build appeared first on Nightingale.

This article is part 9 in a series on data exploration, and the common struggles that we all face when trying to learn something new. A list of previous entries can be found at the end of the article. I began this series while serving as the Director of Education for the Data Visualization Society in 2022, because so many people were asking to hear more about data exploration and the process of learning data vis. What began as an exploratory project on the “State of the Industry Survey” data grew into a 1.5-year project that produced a 30-page 2023 “Career Portraits” publication (DVS member login required). This series gives an inside view of the project, illustrates my process for approaching a big project, and demonstrates that no “expert” is immune from the challenges and setbacks of learning. Let’s see where this journey takes us!

Where we last left off, my early discovery project on the DVS State of the Industry Survey data had morphed into what would become the Career Portraits initiative for the DVS. In Step 6, we got serious about reducing scope in the focus phase for the project, and in Step 7 we talked about how the right cuts can actually inspire new growth and expose new opportunities for collaboration. Now, it’s time to come back to the core project and start the long, uphill climb of the Build stage. 

A diamond-shaped diagram of a project’s lifecycle in four segments: “Expand/Ideate” (marked “Max risk of overwhelm”), “Focus/Consolidate,” “Build/Produce,” and “Deliver/Deploy” (marked “Success!”). A vertical axis labeled “Size of Project” runs from “Question, interest, or idea” at the bottom to “Plan for what to do and how to tackle (at least some of it)” at the top. A “You are here” marker sits at the transition between Focus/Consolidate and Build/Produce, noted as the point of “Max risk of exhaustion.”

The framework for this series is my version of the double-diamond model for design, first popularized by the British Design Council in 2005. The first diamond begins with the Expand stage, where all ideas are on the table. It’s all about discovery, innovation, and sketching ideas quickly. It’s fast, sometimes sloppy, and it’s usually skin deep. 

The Focus phase is the narrowing half of the first diamond. You step back and look at all the options, choose your core project direction, and focus on the specific tasks that need to be done. The Build phase at the start of the second diamond is where we shift into low gear and do the hard work of building the real thing. This is where craftsmanship and slow, deliberate effort really start to shine. It’s where discipline, hard work, and the pursuit of perfection come into play. At the end of the Focus phase you should have a clear plan; Build is where you execute.

Of course, in reality a project oscillates back and forth between expand and build phases throughout its life cycle, but the prevailing goals of the Build phase are ensuring accuracy and optimal quality and making sure that we get our project to the finish line. 

A humorous line graph of the emotional rollercoaster of a project, its peaks and valleys labeled from “Start. So excited!” through “Too many ideas!,” “This is the one!,” “Looking good; let’s build it!,” “Hmm. Well, that’s not going to work.,” and “Oops. Forgot this other thing.” to “Yay! We made it!” and, finally, “See? That wasn’t so hard…”. The graph charts the ups and downs of a project from inception to completion with a light-hearted take on the challenges and breakthroughs along the way.

For some people, the Build phase is pure joy. It’s a time to work your technical muscles and make clear progress toward defined objectives. This is where your achiever side shines. You get to check things off of your to-do list from the Focus phase, and you should always be making visible progress toward your goals. 

For other people, Build is a painful and boring slog. Coming back to our mountain climbing metaphor from a previous article, at the start of the second diamond you are standing at the foot of the mountain and you can’t even see the top. In that moment, you might panic and decide that it was more fun to plan the trip than it is to climb the actual mountain. You might be tempted to just turn around and go home. Or maybe you’re really excited when you first start the climb, but then your muscles start to hurt in the first 5 minutes and you wonder if maybe you’re just not cut out for this. If this is you, that’s ok! You might decide to spend your time in Expand instead, or you might experiment a bit to see how you can make Build work better for you. 

If you talk to people who actually climb mountains, they will probably tell you that planning is the least rewarding part of the experience: there is no replacement for actually being there. Yes, it’s hard work, but they’ve hooked into other rewards that make the effort part of the joy. They will tell you that you need a good plan that’s matched to your strength and endurance, along with the courage to start and a willingness to embrace the discomfort that is part of every climb. After that, you just take one step after another, lean into the effort, and keep going until it is done. It will be hard, and you will be tired. You will fight your own resistance along the way. You will need to push through all of that to get to the other side. There’s no shame in turning back – sometimes it’s the wisest choice, especially if you’ve misjudged your skill level – but this is your mountain and you will need to climb it if you want to get to the top. 

It’s important to realize that the work often gets easier as you go, because the view and sense of achievement start to pull you along. You hurt less as your muscles warm up and the endorphins kick in, and you start to find a rhythm in the work. The people who excel at Build are the ones who have enough confidence in the journey to get through that initial discomfort, knowing that there is something worthwhile on the other side. They might even enjoy the hard work and find pleasure in the challenge of pushing their own limits. Skating over lots of ideas and imagining what we could do might be really fun, but it is the effort and reward of creation that really lights Builders up. 

Neither the Expand nor the Build phase is better or worse than the other. They require different skills, and appeal to different people. Sometimes a single person is able to master and enjoy both phases, but most people lean more toward one or the other. If you’re a natural at one and struggle with the other, that’s pretty normal. 

Switching metaphors, it doesn’t really matter which hand you prefer to write with, as long as you end up with a similar result. Still, most people will choose one hand over the other for certain tasks. There are few truly ambidextrous people out there, and most of them put in significant, conscious effort to train their second hand before it can be useful for precise tasks. Expand and Build are just different strengths.

If Build isn’t your thing, have patience with yourself, and try to enjoy the challenge of learning something new. You may never get to a point where you are equally strong in both phases, but time and practice will help you to become more comfortable working in the one that comes less naturally to you. 

Things you need in the Build stage

Focus, and a plan.
Coming out of the first diamond, you should have a clear plan for what you need to accomplish. Know what needs to be done, and put all of your energy into doing it.

A realistic sense of your own capabilities and strengths.
You would be crazy to summit Kilimanjaro without training, and you shouldn’t expect to be creating masterpieces out of the gate in data vis, either. Know your skillset, and choose the project that pushes at the level where you truly are, not where you wish to be. If you’re in over your head, the safest and smartest thing is to turn back, or find someone who can help.

Discipline.
This is the time to do things right. No shortcuts, no “I’ll come back to this later.” You don’t want to leave a bunch of holes in your final deliverable. This is where you clean up and resolve all of the things you left dangling in the Expand phase. For many, this is the hardest part of Build; you can’t defer the things you’d prefer not to do any longer. If you haven’t been cleaning up loose ends all along, now is the time to deal with them. Think how much better you’ll feel when they are resolved!

A commitment to excellence and craftsmanship.
Build is where you take the time to do your very best work. If you are a natural Builder, your inner perfectionist may be feeling traumatized by the fast and loose approach we took in the Expand phase: this is where that part of you can retake control and really shine. Just make sure to keep things positive, and not self-defeating – it was your job to explore during Expand. A bunch of loose sketches and semi-realistic ideas is actually what perfection means in the ideation stage! This is where you take those half-finished sketches and rework them into something real. If your Builder is feeling completely exasperated with your Expander, that might be a sign that you’ve overcommitted or that you’re not setting realistic goals. This is a good time to check in and head back into Focus if necessary.

The ability to say no.
There will be times when you’ll be tempted to go “just a little bit further” or add one more thing. This is your judgment call, but it’s important to keep focused on your goals for the project and only add what you need. If you tire yourself out on the side trails, you may not make it to the peak. If you get to the end early and still have energy, you can always go back and explore on the way back down. Compulsively adding too many things during Build is a common reason that people burn out and don’t make it to the end of the project. It may also be a sign that you’re jumping back into Expand too readily, and not sticking it out with Build.

Enough time.
The build work should start as soon as possible; don’t leave it all to the end or push yourself up against a deadline. Nobody does their best work when they’re under the gun: that creates prime conditions for your inner perfectionist to freak out and melt down rather than help. If you forgot to account for the time it actually takes to do the work, then you should go back to Focus and revise your scope accordingly. If you over-committed, now is the time to admit it and beg forgiveness. This is where you earn your own trust. You can insist on just powering through no matter what, but know that it will be harder to let your vigilance down enough to succeed in the Expand phase next time if you do.

Self-knowledge, and compassion.
You should be pushing to your limits in Build: this is where you achieve creative growth. Doing that without injury requires that you know your strengths and what you need to get through a difficult challenge, and that you know when to stop. If you find yourself always spinning out or pathologically avoiding the Build phase, a habit of pushing yourself to injury is probably why: some part of you knows that it’s not safe, and it doesn’t trust you to go there. 

Energy breaks.
You’re at maximum risk of exhaustion in the build stage. Remember to stop and do something else from time to time. I like to have a different project in Expand while I’m working on Build so that I can switch gears and do some ideation or sketching when the Build work gets hard (without adding to my current scope!). This is a great place to ideate on Part II of your project, so that you’ll have some ideas ready when this one is done…just be sure that doesn’t turn into increasing your scope for Part I. Switching into Expand mode on a subtask also helps to break things up. Working in different phases is often more effective for me as a “rest” period than taking a complete break: I can rest the mental muscles that are tired, without having to leave flow. 


Remember that the Build phase should be a constant conversation between final deliverable and process. It’s really important to take your time here. Work through the problem again all the way from the beginning, taking advantage of all the things you learned in Expand. You may need to step back into Expand or Focus for a little while as you refine the detailed picture of where you’re trying to go.

Try to get through Build without making major changes to your plan. “Easy come, easy go” is an Expand mentality. It shouldn’t be necessary to throw everything out and start over at this point. As you get deeper into Build, you’re putting in work that’s going to be painful to toss out. Instead of scrapping everything when the going gets tough, lean into it, focus on the goal, and push through. That said, you should expect your picture to shift a little as you get more information about what is (and isn’t) possible, and as you tie up all those loose ends. If you find yourself tempted to run for the exits, that’s probably a signal that you’re overdoing it. Take a rest and re-evaluate, or escape into Expand for a while and then come back.

Now that we’re acquainted with the Build phase in general, let’s take a look at what was happening in the Career Portraits project in this stage.

Content Creation

Clean up the data.
In Focus, we decided on the final variables to use and comparisons that we wanted to make. For Build, we needed to recalculate our values, adjust units, and re-aggregate several of our analyses to allow slightly different comparisons in the final document. We spent a lot of time going back and forth over whether to show counts or percentages, and when to show both. We also checked (and re-checked) our code and our results to make sure that our numbers made sense. 
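The counts-versus-percentages choice described above comes down to computing both and deciding, chart by chart, which to show. The authors did this work in R; as an illustration only, with made-up survey data rather than the actual DVS results, the re-aggregation might look like this in Python:

```python
import pandas as pd

# Hypothetical survey responses: one row per respondent
df = pd.DataFrame({
    "role": ["analyst", "analyst", "engineer", "designer", "analyst", "engineer"],
})

counts = df["role"].value_counts()                 # absolute counts per role
percents = (counts / counts.sum() * 100).round(1)  # share of respondents

# Keeping both side by side makes the counts-vs-percentages choice explicit
summary = pd.DataFrame({"n": counts, "pct": percents})
print(summary)
```

Re-checking the numbers is then a matter of asserting that the percentages sum to 100 and the counts sum to the number of respondents.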

Get clearer on the details.
We knew what we wanted to show in the data vis, but we still needed to choose the specific visualizations and define our tools and approach for creating them. We did a brief Expand phase to review sketches of different visualization types, evaluating their strengths, weaknesses, and feasibility, and then focused back down quickly to the ones we knew we could build in the time that we had.

Production

Think through the format of your final deliverable.
We knew we were creating a digital pdf, but we still had to think through the page size, layout specifics and implications for font sizes, labels, chart formatting, etc. All of these affected the details of our chart and page designs, and set some hard limits on how detailed the visualizations could be.

Pick your technology(ies).
Jenn (my collaborator) and I had a mix of different skills between us, and we didn’t want wrangling with code to slow us down. In the end, we built most of the visualizations in R and then annotated them in Illustrator. Some charts were used directly from R, others were calculated in R and re-built in Illustrator from scratch, and still others were made in Figma. In some cases, D3 or another solution would have been a lot better as an end-to-end tool for production, but for a one-off print publication it was much faster for us to build from where we were and with what we knew well. This hybrid approach required additional manual work and cleanup, but it gave us more control over the final formatting than we could get easily from code. 

Build out the charts.
Once the chart types and general layout were selected, we still needed to build the visualizations and calculate the actual values for our analyses. This required another round of code edits in R, and some additional exploration about how to export and edit the charts once we were done. Jenn was able to do a lot with base chart theming in R, but there were still some visualizations that required manual work. It turns out that R can’t export editable text labels, so anything we changed had to be re-typed in Illustrator by hand. The labels alone took more than 30 hours of work to clean up. We understood that cost up front and we accepted it, because for this project it was the simplest way to get to our goal.

Generalize and clean up the code.
As we worked through the final details of the analysis, there were several opportunities to go back and restructure the code to make it more consistent. This helped to make it clearer and more robust, and it also cleaned up a couple of minor calculation errors that we might not otherwise have caught. We chose to do this step even though we weren’t planning to productionalize the analysis, because we wanted our code to be readable for use in future projects and we wanted to make sure that we caught any mistakes or bugs in the data.  

Finalize your narrative.
We had a pretty good sense of the overall discussion we were interested in and the metrics we wanted to show, but you can’t actually finalize your narrative until you’re sure that the data is solid. We re-wrote most of the supporting narrative and revised our document structure more than once as the data calculations completed, to be sure that our comments and the data details were aligned.  

Identify new analyses or content, and prune as needed.
As our picture of the report became clearer, we realized that there were some additional metrics that we wanted to include. We considered the full set of options and did another round of Focus to finish things up. Some of these were larger than we could afford, so we put those off for another day. We also removed things that no longer fit. We’d planned to include an analysis comparing answers from independent visualizers and those employed in organizations, but when we got into the details, the branching structure of the survey and distribution of responses made it hard to compare those populations in a meaningful way. It would have been possible to rework the analysis to include that comparison, but it would have meant going back to square one. Reluctantly, we added this to the list of follow-up projects that we could return to later.

Assemble the final document.
Once the analysis, text, figures and other content were complete, they needed to be assembled into a single document for publication. This process alone took a couple of months and went through several rounds of revision. We used InDesign for the layout and imported the images from files to support the many, many edits and refinements required as we worked through micro edits for the final doc.


There are a lot of pieces involved in Build, and it can be difficult to navigate all of the loose ends and get to something that you’re proud of. The flip side is that you get to see the work develop and grow into its final shape, and there is a lot of pleasure in creating a solution that feels well-resolved. In the next article, we’ll take a closer look at some of the individual choices going on in this stage of the project, and how practical and editorial choices came together to shape the final document.

Previous articles in this series:

Embrace the Challenge to Beat Imposter Syndrome
Step 1 in the Data Exploration Journey: Getting to Know Your Data
Step 2 in the Data Exploration Journey: Going Deeper into the Analysis
Step 3 in the Data Exploration Journey: Productive Tangents
Step 4 in the Data Exploration Journey: Knowing When to Stop
Step 5 in the Data Exploration Journey: Collaborate to Accelerate
Step 6 in the Data Exploration Journey: Cut to Realistic Scope
Step 7 in the Data Exploration Journey: Spin Off Projects

Related links:

Early Sketches for Career Portraits in Data Visualization, by Jenn Schilling
DVS Careers in Data Visualization, YouTube Playlist for interview series by Amanda Makulec and Elijah Meeks
Career Portraits project (DVS Member space login required)

Categories: How To

The post Step 8 in the Data Exploration Journey: Build appeared first on Nightingale.

Book Review – “The Data Storyteller’s Handbook” https://nightingaledvs.com/book-review-the-data-storytellers-handbook/ Tue, 13 Feb 2024 16:45:00 +0000 https://dvsnightingstg.wpenginepowered.com/?p=19900 Learn more about Kat Greenbrook's latest book while diving into her process behind its creation.

The post Book Review – “The Data Storyteller’s Handbook” appeared first on Nightingale.

Kat Greenbrook is a data storyteller from Aotearoa, New Zealand. Through Kat’s online presence and shared material over recent years, I’ve been an admirer of both her graphical style and industry-leading content in the field of data storytelling. So, when I learned she was writing a book, I jumped at the chance to talk to Kat to learn more about the process behind the book and its contents.

Our (edited) conversation is below – I hope you get as much from it as I did!

Neil Richards (NR): Tell me a little bit about yourself and what your role is. How did you become a data storyteller?

Kat Greenbrook (KG): Okay, I’m not going all the way back but… when I left university, I worked for a good ten years in analytics. I was a SAS coder and built predictive models for large organisations. But I was really frustrated with the job I was doing. I’d build these models, spend months on projects, and the output of all that work didn’t really go anywhere. It wasn’t creating the business impact I was hoping to achieve. So, I thought I’d try to change my career.

I wanted to get out of analytics and try something different, so I re-trained in digital design. I did a design degree over two years. I was working as an Analytics Consultant at the time and was intent on leaving the field. I planned to become a graphic designer and do something completely different. But as fate would have it, around this time, data visualisation started to get popular. This was the infographics era, and I was in the right place at the right time to jump on board that train.

I leaned quite heavily into data visualisation as a solution to engaging people with analytics. I thought that if I could make something pretty enough, it would have more of an impact and that data would be able to change things a little more.

It didn’t happen that way – the more I worked in data visualisation, the more I realised that data visualisation will only get you so far. It was a communication problem rather than a design problem. So, I set about trying to improve my communication skills — just in general. I went to a science communication conference and listened to someone talking about narrative structure. It was the very first time I’d ever heard that term. And that was my lightbulb moment — when I realised what was missing from my data visualisations, and what would help me answer that, “so what?” question. You know when you create a visualisation and people are like, “it’s pretty, but so what?”. That’s the first introduction I had to narrative, and from then I haven’t looked back.

An infographic showing the different data story types, broken down across three acts.
Example page featuring practical resources and signature Kat-style graphics

NR: It’s such a familiar story. I also started in a sort of background data processing role, similar to you but not as clever! You have that first realisation when you wonder whether anyone is actually looking at this data or doing anything with it. Then you take that next step into data visualisation and feel like you’re closer to that point of trying to explain data and get people to understand it. You almost have to have that second realisation that says, “we still need to help people to bridge that gap”. I love the fact that, again similar to me, you’ve just come across the idea of narrative quite recently, because if you perhaps come from a STEM background and don’t come from a traditional story background, it’s a whole new way of looking at things.

KG: Absolutely, it’s a different perspective that completely changes the way that you view the work that you do.

NR: It’s such a hot topic – I’ve been up and down in the past on the idea of data storytelling and I’m very much “up” on the idea at the moment, but do you tend to get pushback on it? Or the idea that storytelling is some kind of term that people might dispute what its meaning or usefulness is?

KG: Yes, especially in the early days. I started running data storytelling workshops around six years ago and in the first couple of years there was a lot of pushback. Some of it was from people who had gone through certain educational pathways: those very scientific, very analytical pathways. In these you get taught not to introduce any bias in how you communicate data. And then here I am, telling people to pick a message, to communicate a message! For them, it was a real case of, “no, that’s not what we’ve been taught, that’s not what we’ve been trained to do”. I also felt that way to begin with. My degree was in science. I come from that educational background as well. So, for me it was also a shift in my thinking.

Some of the pushback also came from people not really understanding what it means to do data storytelling. Sometimes, people think of it as this creative, fictional thing, almost like you’re cherry-picking data to support a predetermined narrative. But this is not good data storytelling practice.

NR: No, of course not. I suppose people might interpret storytelling as being like a fable, a fantasy, or a lie. I think narrative is so key – getting the narrative structure and explaining that that’s how you need to get your message across. You’ve been doing the workshops for about six years. What made you decide to write a book?

KG: [nervous laugh and pause]

NR: I love the pause – as someone who’s been through that himself, it’s probably a case of, “yes, what was I thinking?!”

KG: Yes – why did I do this?!

A woman in a maroon sweater and a man in a green shirt join in a laugh over an online video call.
Our reaction to the idea of why either of us thought writing a book was a good idea …

It was initially to complement my workshops because I was constantly asked, “what can I read more of, how can I upskill, how can I build on what I’ve learnt today?”. I had a list of recommended resources but there wasn’t anything that I could point to and say, “everything we’ve talked about in this workshop, you can find here”. So, I wrote the book to build upon what I cover — so people could go a little deeper, learn at their own pace, and have that resource available.

But I think there are lots of reasons to write a book, and one of my personal ones is to position myself a little more as an expert in the field. I’m based in New Zealand and it does feel very isolating all the way down here! It’s been amazing, actually, having written a book. I feel much more connected to so many people around the world. It’s really opened my network, which I’m enjoying.

NR: Yes, you’re talking to a guy who has a copy of your book in his hand, twelve thousand miles away. It’s a nice feeling isn’t it? [Edit: apparently Wellington, New Zealand is 11602 miles away from my village as the crow flies, I’m pretty pleased with that guess!] I love the book – I follow you and am aware of your work. I haven’t been to your workshops, but I’ve used a couple of your slides before. Whenever I have one of your slides, I know it’s a Kat Greenbrook slide. You can tell from the colours, you can tell from the design; you have such a unique design style, and so this book has a lovely aesthetic to it. You said earlier that you were really interested in going into graphic design. Was this a chance to put some of your graphic design work on show as well in the book?

KG: Yes, the work I’d done to date did lean very much into my graphic design background. That’s just how I like to tell stories, using those kinds of visuals rather than the more traditional BI platform visualisation tools; I prefer more graphic design tools. But the look of the book held me back for a long time — it was a roadblock! I knew I wanted to create a visual book. I had this idea of a kids’ style book for adults, and that was a vision I couldn’t really get out of my head. I wanted it to be friendly when you pick it up. Some technical books can seem quite daunting, and people look at them and think, “that’s not for me.” I wanted my book to be very accessible. I knew the visuals were going to play a big role but I just had no idea what they were going to look like; I didn’t have a style.

An illustrated boy plays with bars from a bar chart tied up in a leash.
Another page with signature style graphics and clear take-out message.

I think it was at the end of 2022 – I had a conference that I was presenting at and thought, “I’m just going to experiment with a completely different look”. I designed those little people for the first time, and I got really good feedback on it! So, I thought, “okay, this is obviously resonating, I’m going to run with this”. I started creating them in a way that I could productionise them. As you know, there are a lot of little people and designs throughout that book, and I needed a way to make it easy for me to design. I created a bunch of different kinds of arms, different kinds of legs, different kinds of hair, and so on.

NR: What I like is that you say you want it to be not too daunting: it’s almost four hundred pages. It’s a lovely sized book, but you’ve made it in such a way that it’s really clear and readable on every page. You wanted it to be skimmable as well as readable from start to finish. Your introduction in the book even says, “feel free to jump around in this book.” It also says that’s the way you read books as well – to buy books with the hope of magically absorbing the content! I feel like you’ve achieved that really well. I’m also personally guilty of buying books which go on the shelf and I maybe don’t get round to reading them, but I keep picking your book up and reading more every time before putting it down; I feel like you’ve absolutely nailed the way that you wanted to do this. I had to almost make myself go back and read through from start to finish in order to fully read it for this chat. But it very much feels like every page or opposing pair of pages is like a learning point, like a leaflet or a slide that you might see.

KG: Oh good! That was very much an intentional design thing. I wanted it to be very skimmable because that’s how I read books. I wanted it to almost be like Instagram – you can look at a page and feel like you can just take what you need from that page. I wanted it to be shareable on social media and I’m hoping that happens.

NR: I’m sure it will – I shall be sharing my thoughts on social media, certainly! Who would you say is the typical audience for your book? 

KG: I’d like to say it’s a super general audience for anyone who needs to communicate data — and it will certainly help this audience! But I think I wrote this book specifically for people in analytics roles who are struggling to communicate their data outside of a dashboard. Typically, within large organisations there are lots of people doing analytics and they’re probably very skilled in designing dashboards, because that seems to be the default way of communicating data. But I think organisations are now asking for just that little more detail, that little bit more explanation. People in these roles are being told to “make your data tell a story”, and they might have no idea what that means. I wanted to have this, almost like a playbook, a handbook to say step-by-step, “this is how you do it; this is what it means”. That’s my particular audience group.

NR: I feel like there are parts of this book that are giving us those building blocks, those tools to find the story before telling it, because I think we’ve mentioned the importance of narrative and telling the story, but I feel sometimes there can be that gap where you don’t actually know what the story is, or how to find it, so I really like some of the tools and frameworks you’ve given us, your readers, in order to do that.

KG: Thank you! Yes, it’s trial and error, and it’s what I’ve learnt through running workshops and through doing this myself — just experimenting with what does work.

NR: How did you enjoy the book-writing process once you got started? Was it a lot more solitary than the workshops that you’ve been giving?

KG: Absolutely! But I’m an introvert, so I quite enjoyed the whole process of locking myself away and forcing myself to write. Some days were really hard! I think any writer, and you’ve probably experienced this as well, will know what writer’s block feels like – you just feel like you’re hitting your head against a brick wall trying to get something to come out. But most days were pretty good. I enjoyed switching between writing and design as I was doing both at the same time. So, if I was having more of a design day, I’d focus on the images. If I needed to think a lot about a subject, I’d say this is a writing day and drill down into the subject. I found it was a nice opportunity to go deeper into things that I thought I knew well. As a writer, you come out of the process as an expert, you don’t necessarily go in as an expert. Just having the opportunity to think deeper about what I was already teaching – I really enjoyed that process. It’s very different to running a workshop, where you’re “on” all day. Writing is very solitary.

NR: Yes, I love what you just said there – you don’t necessarily go in as an expert, but you come out as an expert. I think that’s something really encouraging to any people who might want to write a book. I certainly don’t feel like an expert on anything other than my own process, but that’s what I wrote about in my book. I felt I improved my understanding and got more clarity on the way I did things and my ideas just going through the process. It is actually a good way of learning yourself, and then passing on that learning to other people.

I for one will certainly be using a number of your methods when it comes to trying to teach data storytelling in my organisation. With that, actually, I’ve noticed you’ve made an awful lot of material and resources free and easily available to anyone who needs them. Can you remind me where that is – is that through your website?

KG: Yes – that’s all through my website, www.roguepenguin.co.nz. I wanted to make it easy for people. I didn’t want people to pay extra to get templates. If you buy the book, you should be able to create that data story and the process should be free of friction. I wanted to make it easy for people to do that.

NR: How did you come up with Rogue Penguin?!

KG: I always get asked this! 

NR: Sorry, it’s a completely random question! I mean a penguin, we love and associate with New Zealand, but why rogue? Was that because you were striking out on your own, I guess?

KG: Yes, a little bit. When I started the company, I was going through all this brainstorming – as you do when you’re trying to start a company – trying to figure out what works best. At the time, there were a whole lot of other companies that had “data” in their name. They were very technical-sounding and (I thought) kind of boring. So, I wanted to stand out from that. My idea of going out on my own was to do things differently. I wanted to shake things up a bit and so the word “rogue” really resonated in terms of what I wanted to do with the company. And “penguin”, yes we have loads of them in New Zealand, but it was also my daughter’s favourite book at the time; she was five. Every night I’d read her this book called Penguin. It was very meaningful at the time, and Rogue Penguin was born!

NR: Do you have one stand-out piece of advice from the book over anything else? Is there one overarching message that we can take from the book? 

KG: Yes, it’s to understand your data story before you try to tell it. A lot of what I see today in [air quotes] “data storytelling” is just data visualisation. It’s missing the story. But it’s hard to visualise a data story if you don’t know what it is. So, my big takeaway is to understand your message.

Understand the story you want to communicate because it’s then much easier to create visuals to support that narrative. I think that’s what’s missing from a lot of what people are calling data storytelling.

NR: Yes, I agree, and there’s so much conversation about it. Even today, even this week, on any of the social platforms that you choose. I love the fact that data storytelling is the topic that everyone’s talking about, and in a way I don’t mind if people have slightly different interpretations of it, but I think more and more people have come round to the importance of data storytelling – I certainly have. Just as we were talking about earlier – to get that narrative, to help explain, and to help lead to those data-informed decisions.

KG: That’s it. Organisations these days have so much data that they need this translator role, which is where the data storytelling fits in. I think analysts who have gone through the analysis process are in the best position to tell the story of data or explain what it means. They shouldn’t shy away from being that person as that’s how they’re going to create impact with their analytical outputs — good communication is what’s missing.

NR: You’re right, it’s so key, isn’t it? That’s why I feel this is so important. You’ve described yourself as an introvert earlier; I describe myself as an introvert, and probably most of us in the analyst world describe ourselves as introverts. These discussions and these tools that you’re helping bring to us in your workshops, and now your book, are what help us, the analysts, tell the crucial stories.


You can order Kat’s book on Amazon.


Disclaimer: Some of the links in this post are Amazon Affiliate links. This means that if you click on the link and make a purchase, we may receive a small commission at no extra cost to you. Thank you for your support!

Categories: Community Reviews

Divisive Dataviz: How Political Data Journalism Divides Our Democracy https://nightingaledvs.com/divisive-dataviz-how-political-data-journalism-divides-our-democracy/ Wed, 10 Jan 2024 19:01:10 +0000 https://dvsnightingstg.wpenginepowered.com/?p=19638 Recent dataviz research shows how the most popular tropes in political data journalism can distort the democratic process.

The post Divisive Dataviz: How Political Data Journalism Divides Our Democracy appeared first on Nightingale.

Democracy in the United States is under threat. Are we, humble data visualization designers, to blame for this?

Not exactly… But we’re not as innocent as you might think. 

Data stories have a unique role to play in a well-informed democracy. Unscrambling messy social issues requires data. Informing policy discussions requires a quantitative, analytical approach. So political data journalism can be a powerful tool for encouraging an enlightened electorate. 

But in practice, political data stories are dominated by infotainment like election polling. Polling in the news is like reality TV for people who like spreadsheets. It’s fun to see how crazy other people can be; polls simply offer this voyeuristic thrill through charts instead of on-camera confessionals. However, despite being a popular guilty pleasure, polling coverage isn’t particularly enlightening. It’s also more than a little toxic, with real-life social costs. (For a primer on the psychology of partisanship and how dataviz plays a role, read my Nightingale story, “Through a Partisan Lens: How Politics Overrides Information.”)

Three recent research projects demonstrate these costs, highlighting unintended social consequences of political dataviz. 

These findings are anxiety-inducing, especially as we approach an election season with so much at stake. But political dataviz doesn’t have to be toxic. When it’s focused on enriching information, rather than engaging headlines, it can be a powerful tool for civic enlightenment. 

“Red States vs Blue States”

For all the grief caused by the U.S. Electoral College, it has one enjoyable side effect: Every four years we get a cartographic bonanza of election coverage and news sites sprout forth a fresh crop of creative election maps, like an endless field of red-and-blue wildflowers to comfort us in our times of civic despair. 

But, like the Electoral College itself, these maps are subtle agents of chaos. 

A collage of maps, showing different ways to represent state-level presidential election results. Most of these show results as a binary outcome, either as “red states” or “blue states.” Collage by Eli Holder.

The maps above all show state-level tallies for U.S. presidential elections. They have different strengths and weaknesses, but share an important common trait: States are coded as either red, blue, or some color suggesting “soon to be red or blue.”

But there’s no such thing as a “red state” or a “blue state.” 

Consider Texas, which is often called a “red” state. In the 2020 presidential election, more Texans voted for Joe Biden (5.26 million) than did voters in any “blue” state except California. Even New York, a Democratic stronghold, had roughly 20,000 fewer Biden voters than Texas.

In reality, large numbers of Democrats and Republicans live in all 50 states, even the ones that consistently lean red or blue. So states are all varying shades of purple (at least in terms of presidential preferences). 

While popular election maps accurately reflect the “winner-take-all” dynamic of the electoral college, they create the misimpression that state electorates are monolithic blocks of only-Republicans or only-Democrats. 

Experiment: Dichotomized maps and perceived voter influence

In early 2023, researchers at Harvard, the University of Wisconsin – Madison, and the University of Virginia ran a study to understand how state-level partisan stereotypes might impact voting perceptions. 

A reproduction of two stimuli used in Rémy Furrer & friends’ 2023 paper: “Red and blue states: dichotomized maps mislead and reduce perceived voting influence.” The map on the left shows presidential election results as a binary red-vs-blue. The map on the right shows the same results on a continuous scale of red-vs-white-vs-blue.

In the experiment, they showed retrospective maps of the 2020 U.S. presidential election to see what conclusions the participants drew from them. The maps were all geographic choropleths, but they used two different representations of the underlying vote: 

  1. Dichotomous values, where the election results were binary — blue if Joe Biden won the state, red for Donald Trump. This reflects the winner-take-all dynamic of the electoral college but doesn’t say anything about the actual balance of support within each state. 
  2. Continuous values, where results are presented as gradients. Darker colors represent the margin by which the candidate won the state, with white representing very close races. This paints a more nuanced picture, highlighting that some states are more evenly balanced in terms of party composition. 
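The two conditions differ only in how a state’s vote margin maps to color. A minimal sketch of that mapping, in Python — the specific colors and margins here are illustrative assumptions, not the study’s actual palette or data:

```python
# Two ways to encode a state's vote margin as a color, mirroring the study's
# two map conditions. Margin = Democratic share minus Republican share, in [-1, 1].
RED, WHITE, BLUE = (178, 24, 43), (255, 255, 255), (33, 102, 172)

def dichotomous(margin):
    """Winner-take-all: any positive margin is solid blue, otherwise solid red."""
    return BLUE if margin > 0 else RED

def continuous(margin):
    """Interpolate from white toward red or blue by the size of the margin."""
    base = BLUE if margin > 0 else RED
    t = min(abs(margin), 1.0)  # 0 = tied (white), 1 = unanimous (full color)
    return tuple(round(w + t * (b - w)) for w, b in zip(WHITE, base))

# A 51-49 squeaker and a 65-35 blowout are identical on the dichotomous map...
assert dichotomous(0.02) == dichotomous(0.30) == BLUE
# ...but visibly different on the continuous one.
assert continuous(0.02) != continuous(0.30)
```

The second encoding is what lets close states read as near-white “shades of purple” rather than as monolithic red or blue blocks.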

Findings

The dichotomous maps caused some trouble.

  • The dichotomous map led to increased geographic stereotyping. Relative to the continuous condition, people who saw the dichotomous maps tended to overestimate the winning candidate’s margin of victory. That is, they assumed red states were made up mostly of Republicans and blue states mostly of Democrats, even when the win margins were slim.
  • The dichotomous maps also made people think their votes mattered less. This is actually a reasonable jump from the previous finding. If voters think that a state is dominated by one party or the other, then the election outcomes are a foregone conclusion regardless of how they vote. In reality, this is not a safe assumption. We should all vote regardless of the expected outcome.

The first finding confirms our earlier research showing that, when visualizing social outcomes, hiding variability can increase stereotypes about the people being visualized. In general, if charts don’t show variability, people tend to assume it’s not there.

The second finding is more important, though, because it impacts downstream beliefs and actions (like not voting). A prominent theory on voter turnout suggests that people are more motivated to vote when they think their votes will make a difference. For example, one U.K. study suggests that turnout decreases in districts that have been historically less competitive. That is, like the map study, viewers’ beliefs about how a district is likely to vote can impact turnout. 

Interpretations

There are two important takeaways here.

  • Social outcome charts have social risks. The dichotomous electoral maps are a super common convention for showing presidential election outcomes. At first glance, they don’t seem obviously sketchy. But unlike other topics we might analyze with data, when charts show outcomes about people, especially when they’re split by some social identity (like political party), we should be aware of unexpected side effects. 
  • People are not monoliths. People are weird, diverse, and complex, especially in their attitudes. Given any big group of people, even if they have one thing in common (e.g. the state they live in, their race, their profession, etc.), their attitudes and other outcomes will still vary widely within the group. If you don’t show variability in a chart or map, viewers will assume it’s not there and stereotype them, often to toxic effect.

Election forecasting

The maps in the last study showed votes from previous elections. However, most campaign coverage is forward looking, attempting to predict the results of an upcoming election. 

A bar chart, featured on the New York Times’ homepage on July 31st, 2023, showing election polling results for the 2024 Republican primary for president. How might charts like these influence Republican primary voters? 

The chart above is from the July 2023 Times / Siena election poll, looking at U.S. Republican primary voters’ candidate preferences. This poll made the front page of the Times’ website on four separate days last summer. Their October 2023 election poll enjoyed similar front-page treatment. Same for this latest poll from December 2023.

Election polling often gets the top spot because it’s compelling content. As renowned American (data) journalist Philip Meyer explained, “The most interesting fact about an election is who wins. The most interesting thing about a campaign at any given moment is who is ahead.” 

Despite their popularity, election polls can be quite sketchy. So sketchy, in fact, that since 2016, Pew and Gallup have walked away from the process entirely. Some countries, like France, Spain, and Canada, have even banned polling results from the media in the days before an election.

Most criticism of polling focuses on accuracy. This misses the bigger picture though. Even if forecasts were oracularly accurate, they’d still be sketchy.

  • Election polls aren’t particularly useful. For the same reason “all your friends are jumping off a bridge” isn’t a good reason to jump off of a bridge, a candidate’s popularity isn’t intrinsically revealing of how their governance might align with constituents’ values. 
  • Election polls can distort election outcomes. For example, researchers suggest Hillary Clinton’s lead in the 2016 polls made her supporters complacent and may have suppressed turnout enough to tip the election toward Trump. 
  • Election polls fuel the flames of the media’s “horse race” coverage. Researchers associate this type of coverage with a number of negative, downstream effects like 1) distrust in political leaders, 2) distrust in the media itself, and 3) an uninformed electorate.

Experiment: Can dataviz help manage election expectations?

In explaining the drama with election polling, a popular argument is “blame the audience!” If only people knew “how to read the polls” and could better incorporate error and uncertainty into their judgments, perhaps a lot of this fuss would blow over?! 

If the problem were purely about judging uncertainty, then we might be able to solve it with dataviz. So what would happen if we let leading researchers on visualizing uncertainty design election forecasts?

  1. They would use Matthew Kay’s Quantile Dot Plots
  2. Election polling would still be sketchy.

This year, Fumeng Yang and others at Northwestern won a VIS Best Paper award for their epic study of how even well-designed election charts can impact viewers during an election. 

Screenshot of four stimuli and the surrounding website used in Fumeng Yang & friends’ 2023 paper: “Swaying the Public? Impacts of Election Forecast Visualizations on Emotion, Trust, and Intention in the 2022 U.S. Midterms.” Each condition shows an election forecast using a different way to show uncertainty. 

Above are four possible ways to visualize election forecasts for the 2022 Georgia governor’s race between Stacey Abrams and Brian Kemp. Each of these charts shows the vote share for each party and the probability that either candidate would win. 

Notably, charts A and D are closely related “quantile dot plots.” The only difference is that D has a related animation showing the dots filling in like plinko balls, giving it a more direct physical metaphor. Chart C is a gradient interval, where the darker parts of the band correspond to the areas with the highest probability. 

These charts have different strengths, but they’re all backed by prior research suggesting they’d be effective ways to communicate uncertainty. 
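Quantile dot plots like charts A and D discretize a forecast distribution into a small set of equally likely outcomes. Here is a rough sketch of that idea; the vote-share distribution and all numbers are invented for illustration and are not taken from the study:

```python
import numpy as np

def quantile_dots(samples, n_dots=20):
    """Summarize a forecast distribution as n_dots equally likely outcomes:
    the quantiles at the midpoints of n_dots equal-probability bins."""
    probs = (np.arange(n_dots) + 0.5) / n_dots
    return np.quantile(samples, probs)

rng = np.random.default_rng(0)
# Hypothetical forecast: candidate's vote share ~ Normal(mean=50.4, sd=2.0)
forecast = rng.normal(loc=50.4, scale=2.0, size=10_000)

dots = quantile_dots(forecast)   # 20 dot positions to stack along the x-axis
win_prob = (dots > 50).mean()    # fraction of dots past the 50% threshold
```

Each dot then represents a 1-in-20 chance, which is what gives the plot its countable, “plinko-like” reading of uncertainty.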

Findings

These charts did more than simply relay election forecasts. This research project is one glimpse into a very complex social process, so we’d need more evidence to say conclusively how participants’ responses might translate into effects on a real life election. But among a pile of interesting findings, the results show a few different non-informational influences on the study’s participants:

  • Voting intent. Participants, across all conditions, typically reported that the forecasts increased their intent to vote. At face value, this seems like a good thing. However, there were “slight” differences between the chart conditions (e.g. the trusty quantile dot plot led to the lowest reported intent to vote). And in some cases, the same chart seemed to have different impacts depending on the participant’s party (e.g. for participants who saw the interval condition, Republicans were more confident than Democrats that their fellow partisans would be positively influenced by the chart).
  • Viewers felt feelings. The charts sparked a range of emotional responses. When Republican candidates were predicted to win, Republicans felt good about it and Democrats felt bad. When Democrats were predicted to win, people from both parties felt surprised.
  • Trust in forecasts shifted. Finally, the way the charts were designed impacted viewers’ trust in the forecasts themselves. Some mistrust extended to the forecasters themselves: 10% of participants griped that the researchers’ election forecasting website was “immoral.” These trust effects were largely driven by whether or not participants’ prior attitudes agreed with the forecasted results. 

The impact on voting intent stands out because it potentially speaks to voter turnout, though we have to be careful here. Jessica Hullman, one of the paper’s authors, urges caution in linking the voting-intent results to real-life turnout: the paper shows this is possible, but we’d need more evidence to say for sure. Prior political science research separately suggests that election forecasts can decrease turnout, though this remains a hot topic among election experts. Still, experimental distortions in intent may imply a risk of distortions in actual turnout. If solving for these distortions were simply a matter of better visualizations of uncertainty, we might reasonably expect at least one of the four chart conditions to either a) neutralize intent-related influences (on reported intent and emotions) or b) improve intent-related outcomes uniformly across parties and elections. That’s a near-impossible task, so it’s not surprising that all four conditions showed that, even in an experimental setting, the story is more complicated. Better visualizations of uncertainty aren’t an easy silver bullet for mitigating the potential distortions of election forecast data. 

The impact on trust is also noteworthy given increasing distrust in established media. Political journalists’ horse-race habits don’t help build trust, but this study shows that trust partially depends on whether or not participants’ own parties were expected to win (particularly for Republicans). Other studies show similar results, suggesting that the best way for election polling to seem credible is just showing people what they want to see. But this strategy of audience pandering hasn’t been very successful for some news organizations.

Also, in case it’s unclear, these results absolutely don’t absolve designers of the need to show uncertainty. The four charts they tested might produce vastly better outcomes than the status-quo bar chart, but we don’t yet have the evidence to say either way. 

Setting aside media incentives, these charts are meant to simply inform viewers about likely election outcomes. Instead, they show multiple potential pathways for shaping the outcomes they attempt to predict. 

Interpretations:

  • Good dataviz can’t always save sketchy data. Good dataviz isn’t just about good charts. Good analysis and good data stories start with asking good questions. For election polls, the side effects might not stem from how the results are visualized; it might be that “Who is going to win?” is a sketchy question to highlight in the first place.
  • Data shouldn’t always be visualized. The AP Stylebook on polling suggests that “the mere existence of a poll is not enough to make news.” Election polling isn’t intrinsically valuable information. It also risks a number of unintended consequences. By visualizing this data, we give it more weight, amplify its reach, and increase the risk of harm. So if the results don’t tell an exceedingly important story, they may not be worth the risk. 
  • Trust vs truth. In a variety of tasks, quantile dot plots are effective ways to communicate uncertainty. In this study, however, participants perceived them as the least trustworthy and least accurate of the four chart conditions. Other studies suggest similar tradeoffs between perceived trust and accurate judgments. Participants also had trust issues when they disagreed with the predictions shown in the charts, another effect common to election polling data. This suggests, of course, that it’s hard to tell people things they don’t want to hear. But it’s also a reminder that the most effective designs won’t necessarily be the most popular.  

Issue Polling

So far we’ve talked about ‘election polling,’ which asks “who’s gonna win this next election?” On the other hand, there’s ‘issue polling,’ which covers people’s attitudes toward specific topics like gun control policy, emerging technology, data privacy, and the social safety net, among many others. 

Issue polling reveals a powerful way that dataviz can influence our political attitudes.

This study started with a personal hunch. As a designer, I’m interested in ways that dataviz can help people understand each other. At the same time, as a researcher, one of my overarching theories is that, since people can be weird and judgy toward other people, charts showing social-identity comparisons have an intrinsic risk of triggering sketchy side effects (like victim blaming).

Issue polls illustrate this tension perfectly. They’re, of course, good-faith attempts to help us understand other people through their opinions. At the same time, there’s nothing pollsters love more than charts with social-identity splits, particularly splits between Democrats and Republicans, which imply potential political chaos.

Even though reputable pollsters like Pew and Gallup are out of the election forecasting game, they still play by the same rules of the attention economy. Pollsters, like most researchers, want their work to be seen and to be “part of the conversation.” They also know that political drama is a reliable path to exposure. And, like U.S. news publishers, pollsters assume their credibility rests on being seen as non-partisan, so they’re similarly incentivized toward both-sidesing.

A blue party, a red party, and their polarized attitudes toward arming household pets with deadly lasers.

Given these incentives, issue polling charts in the U.S. often look like the example above, highlighting wide divisions between Democrats and Republicans. This fictional-ish chart, for example, shows polarization between Democrats, who tend to favor restrictions for arming household pets with assault lasers, and Republicans, who tend to oppose such policies.

Charts like these can cause chaos when you consider that our political attitudes are inherently social.

Our political judgments are driven by our people, not necessarily our ideals.

In one clever study, BYU researchers took advantage of Donald Trump’s ideological fluidity to show that his supporters’ policy opinions were also malleable. The researchers prompted study participants with a set of policy ideas (e.g. enforcing “penalties on women who obtain abortions”), as well as an indication of whether Trump himself had endorsed the particular policy. The trick: Because Trump has taken both sides of most issues, the researchers could vary which of Trump’s positions they showed to participants without raising any suspicions. For example, some participants saw that Trump favored abortion penalties while another cohort saw that he opposed these penalties. The study found that, regardless of the issue itself, self-identified conservatives rallied to Trump’s position. That is, the substance of the policy made very little difference; the Trump supporters adopted his stated positions regardless.

This illustrates a “social conformity” effect for political judgments, where people change their attitudes to match the perceived norms of their social groups. The BYU study picks on conservatives, but liberals shouldn’t get too confident: left-leaners are also susceptible to these effects.  

For our study, our main hunch was that these same social conformity effects could actually be triggered through a chart. This would imply that issue polling charts could actually influence the phenomenon they’re meant to represent. What’s more, if people conform their attitudes toward the polarized attitudes they see in issue polling charts, they might become more polarized themselves.  

Experiment: Can public opinion charts influence public opinion? Can they drive polarization?

This study was another collaboration with Georgia Tech’s Cindy Xiong Bearfield, a fellow collector of unexpected quirks in data psychology. We ran three full-scale experiments, testing nine different ways to visualize political attitudes, looking at six different topics of public policy, with a pool of thousands of research participants. The paper passed peer review and was accepted for publication at this year’s IEEE VIS conference. Details on the experiments are here, as well as links to the paper and our VIS talk. 

Five different ways to represent issue polling results, used as stimuli in my (Eli Holder) and Cindy Xiong’s 2023 paper: “Polarizing Political Polls: How Visualization Design Choices Can Shape Public Opinion and Increase Political Polarization.” The experiments tested partisan (left column) vs consensus (middle column) framings, as well as dot plots (top left + middle) vs jitter plots (bottom left + middle) to the control conditions.

The charts we tested fell into three buckets:

  • Partisan charts (left column): Charts showing attitudes split by political party, emphasizing each party’s typical attitudes toward a particular policy like estate taxes. When people saw these charts, we expected them to identify with their own political party.
  • Consensus charts (middle column): Charts showing the overall national attitude toward a policy, emphasizing a national consensus which is, by definition, more middle ground. When people saw these charts, we expected them to identify as adults in the United States.
  • Control charts (right column): We used these for baseline comparisons. For example, the true control just showed a vaguely related stock photo, which presumably shouldn’t affect participants’ attitudes toward any political issues. 

We used dynamically generated versions of these charts, so each participant saw a slightly different distribution of attitudes for each party and issue. This let us compare the strength of attitudes shown on the chart to the strength of attitudes reported by participants. 

Findings

Polarizing Political Polls: Experiment 2 social conformity results: Visualized attitudes influenced reported attitudes. For all three treatment conditions, participants’ attitudes biased toward their visualized in-groups’ attitudes. These plots show participants’ bias in their reported attitudes toward various policies (y-axis: the mean difference between reported attitudes for the treatment, minus control) as a function of their in-group’s visualized attitude (x-axis, 0 = in-group opposes, 100 = in-group supports). Positive bias values indicate higher than expected support, negative indicate lower than expected support. The uncertainty ranges indicate 95% confidence intervals. Stars indicate significant differences at p<0.05.

First, we found that attitudes are contagious, even when they’re shown in charts. 

In the study, participants shifted their attitudes to match the attitudes we showed them in the issue polling charts. For the consensus chart, people shifted toward the national consensus. For the partisan charts, people shifted toward their political parties. As a comparison, the trendlines above would be completely flat if this effect didn’t exist. You can see an explainer for these results here. 
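The “flat trendline” logic can be illustrated with a toy simulation. All numbers here are invented for illustration; this is not the study’s model or data. If reported attitudes didn’t conform to what was shown, the slope of bias against the visualized in-group attitude would be near zero; a conformity effect shows up as a positive slope:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
baseline = 50.0      # assumed control-group mean attitude (0-100 scale)
conformity = 0.15    # assumed strength of the pull toward the shown attitude

# Each simulated participant sees a different in-group attitude in the chart
shown = rng.uniform(0, 100, n)
reported = baseline + conformity * (shown - baseline) + rng.normal(0, 5, n)

# Bias relative to control, as on the figure's y-axis
bias = reported - baseline

# Fitting bias against the shown attitude recovers roughly the assumed
# conformity strength; with no conformity, the slope would hover near zero
slope = np.polyfit(shown, bias, 1)[0]
```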

Polarizing Political Polls: Experiment 2 polarization results: The partisan range chart led to significantly more divergent polarization than the other three conditions. The horizontal ranges show the mean inter-party attitude distance (gap) between left- and right-leaning participants. The symmetric distributions on the ends show how wide the gaps could be. Plots are centered horizontally to avoid implying changes in absolute attitude positions for one particular party. Stars indicate significant differences-in-gaps from control based on non-overlapping CIs (* = 95%, *** = 99.9%).

The image above shows results for one of our experiments, where the gap for partisan-split opinion charts was 69% wider than the control (11.7 vs 19.8 points wide). These shifts were significant and meaningfully large. We also replicated these results in a follow-up experiment, where the gap was 68% wider than the control (18.9 vs 31.8 points). 
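The widening percentages follow directly from the reported gap widths; as a quick arithmetic check:

```python
# Inter-party attitude-gap widths (in points) reported for the two experiments
exp1_control, exp1_partisan = 11.7, 19.8
exp2_control, exp2_partisan = 18.9, 31.8

exp1_widening = (exp1_partisan - exp1_control) / exp1_control  # ~0.69 -> "69% wider"
exp2_widening = (exp2_partisan - exp2_control) / exp2_control  # ~0.68 -> "68% wider"
```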

As participants’ attitudes moved toward the polarized attitudes shown in the partisan charts, the parties’ average attitudes diverged away from each other. After viewing the charts showing more polarized partisan attitudes, participants became more polarized themselves. 

Interpretations:

  • Showing that an idea is popular can make it more popular. This can be positive. For example, given a chart showing that more people believe in anthropogenic climate change, we’d expect the chart to nudge viewers toward the scientific consensus. On the other hand, charts showing increased vaccine hesitancy within specific political or social groups might actually further entrench those ideas for people who identify with those groups. 
  • Visualizing polarization can increase polarization. Showing divided support for an idea can lead to increased divisions. Even if polarization isn’t the root cause of political strife, charts showing partisan disagreement have an implicit social cost. That social cost might be acceptable if the underlying story is important enough, but we can’t assume charts like these are risk free. When deciding if polarization stories are newsworthy, these social costs need to be weighed against the social good of the information they provide.

The Bigger Picture

The stakes of the 2024 election couldn’t be higher. There are legitimate reasons to fear for the future of U.S. democracy. But polling results, as they’re presented in the news, shouldn’t be one of those reasons. For most of us, the only thing we can do is vote and volunteer. Polling results shouldn’t impact that. Whether our candidates are up or down, we still should vote and volunteer. 

If the deluge of data starts to feel overwhelming, try to give yourself some space. Take a few deep breaths. Count to ten. Remember that this content is meant to suck you in. It’s meant to rile you up. The business of polling, from its earliest days, has been driven by making headlines and capturing audience attention. This is still true today. But you have no patriotic duty to follow along. It doesn’t make you a more informed citizen. It shouldn’t affect how you vote and volunteer. In fact, as we’ve seen, you’re probably better off avoiding this data entirely, as it can undermine your ability to make well-justified political judgments. 

News publishers should do their part and dial down the political-data-hype machine. Drawing attention to the horse-race and polarization narratives is, at best, ethically questionable. At worst, it undermines the democratic process, including the norms that make independent journalism possible in the first place (not to mention publishers’ long-term prospects as a business). As news consumers, we shouldn’t be afraid to call them out on these shenanigans and hold them to their lofty mission statements. The Washington Post reminds us that “democracy dies in darkness.” But we shouldn’t mistake election polls for political sunshine.

Does this mean no one should ever visualize political polling results ever again? Of course not. But it requires us to acknowledge that information has risks. It’s usually a force for good, but not always. And, as we’ve seen, political data may be more intrinsically risky than other topics. 

Acknowledging these risks doesn’t require accepting censorship. No one suggests turning off the sun because it causes sunburn. Information, like the sun, illuminates our world. But also like the sun, information can burn. 

So if dataviz heats up what it lights up, we have to be careful how we use it. We’re responsible for the stories we choose to tell and how we tell them. 

  • Learn the risks. Taking responsibility means educating ourselves on the downstream outcomes of data communication, beyond just the clarity of our charts and graphs. In particular when we’re visualizing data about groups of people, we need to consider how it affects the people being visualized. For example, as our issue-polling study showed, visualizing party attitudes can influence the people within those same parties. 
  • Manage the risks. Taking responsibility also means attempting to minimize the harm. If two different chart designs tell the same story, we should choose the one with fewer side effects. As the choropleth study showed, sometimes simple design changes can make a meaningful difference in minimizing harm, without sacrificing the overall story. The tradeoffs might not always be so clear cut, but that doesn’t mean we shouldn’t try.
  • Balance the risks. Finally, taking responsibility means consciously balancing risk and reward. If we acknowledge that information has risks, then we have to ask, what are we getting in return? If we’re going to risk viewers’ trust in democracy, are we highlighting something more profound than “people think Joe Biden is old” or “Democrats and Republicans disagree on even more stuff?” 

If we look past the horse-race and polarization tropes, dataviz can be a uniquely powerful tool for civic enlightenment. This starts with data stories that ask more ambitious questions. Instead of fixating on “Who’s going to win?” we should prioritize questions like “What happens if they do?” Or, as New York University’s Jay Rosen puts it, focus on “not the odds, but the stakes.”

Resources and next steps

This story was updated to clarify that more research is needed to understand how different uncertainty visualizations might influence voter turnout in a real election.

The post Divisive Dataviz: How Political Data Journalism Divides Our Democracy appeared first on Nightingale.

We Cannot Give Up on the Data Viz Renegades! https://nightingaledvs.com/we-cannot-give-up-on-the-data-viz-renegades/ Thu, 14 Dec 2023 15:30:52 +0000 https://dvsnightingstg.wpenginepowered.com/?p=19272 Data viz renegades are often resistant to learning how to properly visually communicate the data they work with. Here's how to help them.

The post We Cannot Give Up on the Data Viz Renegades! appeared first on Nightingale.

Author’s note: All names and project details in the story have been changed or censored.

You’ve probably encountered them before. They can whip up a bunch of tables and be confident that everyone will understand their analytics. 

“The data is all here, simple and obvious. Look, there are twenty filters right here! What do you mean, it’s not clear? Why do we need these columns and circles… You just don’t understand the content!”

In my experience working with business clients, these folks are usually finance professionals, representatives of exact sciences, engineers – they know their field of expertise well. But when it comes to conveying their ideas and insights to other departments or company leadership, their hands turn into paws. Data communication is not always their strong suit, yet many are convinced that they are doing just fine.

They are know-it-all neophytes.
They are overconfident outcasts.
They are stubborn luddites.

For purposes of this article, let’s call them data viz renegades. Throughout my career working alongside these renegades, I’ve noticed that they struggle to create clear, visually appealing reports that don’t make viewers’ eyes bleed. And yet they often resist suggestions for improvement; they are either convinced that their work makes sense, or they have become complacent over the years, excusing themselves because data visualization doesn’t interest them or isn’t—and will never be!—their strong suit.

A screenshot of a dashboard that has four different visualizations, two tables and a long scroll bar of filter options.
A scary looking dashboard.

So, for many years, they avoided learning and resisted improving. But now the world has changed so much that more and more people are forced to work with data and then visualize the results of that work. In the past, people in only a few professions learned this skill; now everyone from bankers to factory workers has to deal with it.

My mission is to guide them toward the brighter side!

Is it possible to communicate with data viz renegades?… It’s as if they’ve closed themselves off and now deny the very possibility that something will work for them. So it’s easy to believe there’s no point in trying anymore… 

Should we give up?

Turning the Beast into the Beauty

No, giving up is not necessary at all! Here’s the approach I found for these renegades, and I recommend it to everyone dealing with such challenging learners:

The key message should be as follows:

“I don’t want to turn you into designers, but I want to teach you how to brief a designer! Because if you tell someone far from your field of expertise to do something with your raw data, the designer will likely create beautiful-looking tables and charts that don’t show the most important information. The designs may be completely unpredictable, or reflect only the designer’s understanding. So it’s important to take responsibility for this product: learn how to set the task correctly, so you can get an acceptable result later.”

Surprisingly, they understood this approach, because they are usually experienced individuals, often in high positions, who have dealt with assigning tasks and accepting projects throughout their careers. The initial fear of having to become a data viz expert was alleviated!

Now we can dive into the theory and basic concepts of data visualization! Just be prepared; progress will be slow. It’s a challenging path, and any teacher or trainer may feel drained and useless with such learners. Take comfort in knowing that this hard-earned skill they acquire will be retained and multiplied, bringing them much benefit. After all, it’s the development of our weaknesses that strengthens us — not polishing what we already know how to do.

The Story of Frank, a Wild, Wild Data Viz Renegade

Let me tell you more through the example of my student, Frank, who is 50 years old! He is lively, full of ideas, confident, a master in his field, and not afraid of challenges!

“I don’t need your dashboards! I’ve got everything under control!” he told me. Yet his initial attempts at dashboards were enough to scare any data viz specialist and evoke disgust towards dashboards in any manager!…

A badly designed dashboard with poorly chosen colors, unintelligible charts and random design and formatting, among other problems.
Frank’s very scary dashboard.

But I gathered enough patience to turn this data-viz monster into a beauty! I’ll explain in detail how I did it. And I recommend these steps to anyone dealing with difficult learners:

  • First, I asked him to show what he was doing, encouraging him to explain, as explaining helps the material sink in better.
  • I tried to be compassionate and empathetic because Frank is as far from data viz as possible; it’s like something completely incomprehensible for him.
  • I constantly praised him, not sparing any compliments! Every tiny step, as small as building a simple chart, was valuable! He had a long string of failures behind him, and even the slightest difficulty could shake his self-confidence.
  • I lowered the bar of my expectations. A lot. I didn’t expect the successes I usually anticipate from an average student. I settled for less. It’s like a child taking their first steps; don’t expect them to salsa dance for you in a week.
  • I stocked up on patience – a massive bag of patience! I was prepared to repeat the same thing many times, tried not to get annoyed, and aimed to be more tolerant and kind.

The results would come, but it would take several times longer than with an ordinary student. I tried to prepare for that and adjust my mindset accordingly. After all, if an ordinary student can grasp data viz in a two-day intensive course, a month-long marathon with a renegade may not yield the same results.

And all the efforts will pay off.

And They Lived Happily Ever After

After months of regular sessions, buckets of my tears, and his sweat, Frank chuckled and confidently opened his latest dashboard version. I could barely contain a shout of joy.

His project might not have been perfect, but it was a decent dashboard, user-friendly, and capable of delivering value.

A much improved dashboard with a clear hierarchy, nicely formatted and colored visuals, and a pleasing layout.
A much improved dashboard.

And he did it himself – is there a greater reason for a teacher’s pride than the success of a student?

I’m sure a resourceful guy like Frank will find ways to use dashboards in his department and squeeze the maximum benefit from his new skills – he’s incredibly persistent!

And I’m happy that another person has mastered data visualization and will be able to work more efficiently with it. 

Hooray, another little star in the data-viz galaxy!

Categories: Career, Design

Can Datavis Make Unpalatable Data More Enjoyable? https://nightingaledvs.com/can-datavis-make-unpalatable-data-more-enjoyable/ Tue, 21 Nov 2023 15:21:06 +0000 https://dvsnightingstg.wpenginepowered.com/?p=19132 I studied emotional reactions to climate-related data visuals. Certain visuals evoked positive feelings—despite the unsettling topic.

The post Can Datavis Make Unpalatable Data More Enjoyable? appeared first on Nightingale.

In the autumn of 2020, I came across a news article from The Guardian discussing audience beliefs and responses to graphs, charts, and maps. The article, titled “Facts v feelings: how to stop our emotions misleading us,” posits that if the title of the graph makes a claim about, for example, climate change, it attracts attention and engagement not because it is true or false, but because of the way people feel about the issue. 

I believe this to some extent, given my PhD research on audience responses to datavis about climate change, where I asked 34 study participants to provide emotional reactions to different types of climate-related data visuals. My work investigated elements of color, hand-drawings, animation, and interactivity using 13 different graphs, charts and maps. I sought to gather empirical evidence about whether and how these elements play into audiences’ engagement with the datavis. Based on my findings, I argue that design and visual style of datavis can be just as emotionally appealing as the subject matter or represented data itself.

The growing recognition of emotions in datavis

The increasing importance of datavis as a means of communicating information to the public has prompted visualisation designers to deliberately consider emotions in their work. For example, data journalist and illustrator Mona Chalabi suggested in 2016 that “there is no such thing as an emotionless data visualisation.” As she explained, datavis always has an emotive charge because emotions and feelings are often, consciously or not, embedded in the design decisions of the creators. On the other hand, Giorgia Lupi highlighted in 2017 datavis’ potential to evoke emotions and connect with people’s lives, transforming and simplifying quantitative data into something that can be both seen and felt.

Existing social science literature supports the idea that individuals respond emotionally, as well as rationally, to datavis. In the journal Sociology in 2017, U.K. researchers Helen Kennedy and Rosemary Hill identified emotional responses to various aspects of datavis, such as the visual style, the underlying data, subject matter, source or where the datavis is published, and skill levels for making sense of visualisation. As the authors point out, it was much easier for people to engage with datavis when they felt confident about their numeracy skills. Effectively, this means that people’s experiences and understanding of datavis depends on how much confidence they have. Other authors, including Catherine D’Ignazio, Rahul Bhargava, Jill Simpson, and Jonathan Gray, all explored the importance of emotions in datavis in their respective chapters within the book Data Visualization in Society.

Exploring emotive datavis features

Drawing on the work of Kennedy, Hill, and others, I investigated the emotional responses of Polish and British audiences to different datavis features, focusing on the following elements: color, hand-drawn elements, animation, and interactivity. These features were chosen based on their potential to evoke emotions in my research participants. While some relationships between these datavis features and the emotional responses to them seem logical and have been discussed by scholars and datavis designers (including Eric Margolis and Luc Pauwels in 2011; Lisa Charlotte Muth in 2018; and Anna Feigenbaum and Aria Alamalhodaei in 2020), there is still little empirical evidence about whether and how they shape and influence audiences’ engagements with datavis. I focused on climate change as a case study, investigating data produced or disseminated by six climate and environmental organisations from the UK and Poland including Carbon Brief UK, Climate Science Poland (Nauka o klimacie Polska), Greenpeace UK, Greenpeace Poland, WWF UK, and WWF Poland.

The emotive power of color

Color emerged as the most emotive feature in climate change datavis. Participants often cited color as the initial aspect that attracted their attention, evoking positive emotions and a desire to explore the datavis further, even when they did not know what the image was about.

For example, one study participant, Aurora from Poland, described her experiences with the datavis Warming Stripes for Poland, which she encountered on Facebook: 

I remember flying somewhere, coming across this beautiful… scrolling on my phone and seeing bars in weird colors. I had no idea what was going on. I didn’t quite see the gradient. The colors passed through each other nicely. And I remember thinking it was so cool, pretty, and fascinating that I had to pause and look at it, not sure what I was looking at, thinking “What is this?” 

However, red, in particular, was often associated with warnings and evoked fear, reinforcing Muth’s argument that certain colors may intuitively carry certain metaphors in a given culture. 
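To make the color mapping concrete, here is a minimal Python sketch of the idea behind warming stripes: one colored band per year, with a diverging palette mapping cool anomalies to blue and warm anomalies to red. The anomaly values and the simple blue–white–red palette below are illustrative assumptions, not the official Warming Stripes data or colormap.

```python
# Sketch: map yearly temperature anomalies to warming-stripe colors.
# The palette and anomaly values are illustrative, NOT the official
# Warming Stripes data or colormap.

def stripe_color(anomaly, vmin=-1.5, vmax=1.5):
    """Map an anomaly (°C) onto a simple blue-white-red diverging scale."""
    # Normalize to [0, 1], clamping out-of-range values.
    t = max(0.0, min(1.0, (anomaly - vmin) / (vmax - vmin)))
    # Interpolate blue (0, 0, 255) -> white -> red (255, 0, 0).
    if t < 0.5:
        s = t / 0.5
        r, g, b = int(255 * s), int(255 * s), 255
    else:
        s = (t - 0.5) / 0.5
        r, g, b = 255, int(255 * (1 - s)), int(255 * (1 - s))
    return f"#{r:02x}{g:02x}{b:02x}"

# Hypothetical anomalies: one stripe per year, no axes, no labels --
# the color alone carries the data.
anomalies = {1901: -0.4, 1951: 0.0, 2001: 0.6, 2021: 1.2}
stripes = {year: stripe_color(a) for year, a in anomalies.items()}
```

The design choice participants reacted to is visible even in this toy version: stripped of axes and labels, the image reads first as pure color, which is exactly why viewers like Aurora could be drawn in before knowing what the chart showed.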

Hand-drawn datavis: embracing intimacy and playfulness

Hand-drawn datavis or elements of datavis evoked strong positive emotional responses among my study participants. For example, Rachel, based in the UK, discussed Mona Chalabi’s datavis of the annual increase in CO2 emissions from 1850 to 2010 that she encountered on the Greenpeace UK Instagram account.

This graph, which is broken up into ten pieces/images, moved Rachel because of its visual form: “You could see there was like a story to tell, and I was interested to see how the graph would pan out, and it’s quite simple and the colors are quite playful. It doesn’t feel too serious, even though it’s about a serious topic.” 

The visual form of this and other hand-drawn datavis created a sense of intimacy and informality, making participants feel more connected to the information presented. This conforms with the argument Simpson makes in her paper, “Visualizing Data: A Lived Experience,” regarding the impact of hand-drawn datavis on emotions. As the author argues, this kind of datavis can be more emotive as it is often presented “as imperfect and incomplete representations of a concept” and is “not associated with technical neutrality,” thus it appears more subjective.

The hand-drawn datavis that I presented to my participants were often perceived as funny, playful, and relatable, illustrating the point made by Feigenbaum and Alamalhodaei. Those authors distinguish between conventional datavis and graphic novels, such as hand-drawn datavis, which operate in different aesthetics and have the potential to humanise data by evoking positive associations and emotions.

Animation: evoking happiness and enlightenment

Participants expressed happiness and enlightenment when engaging with moving maps, charts, and graphs. In response to the question about particular elements of the datavis that made her feel emotional, Hawa from the UK answered that it is usually animation: 

Yeah, it makes me feel happier seeing them (…). Sometimes they can be moving – some charts, some graphs are moving charts or graphs so it makes me feel happier, more enlightened, triggers feelings of happiness because I am able to visually see and observe data that is important. Because it’s also about[…] climate change and I’m able to observe the data patterns over time which makes me feel quite powerful as well, because I’m in command of, you know, understanding what the data is trying to say to us. 

Similar to Hawa, many participants were drawn to animated datavis and felt pleasure derived from watching the animation that gradually satisfied their curiosity, as they explained later when asked about their emotions. 

Interactivity: fostering engagement and exploration

Interactivity, in particular, was related to excitement and joy, offering participants the opportunity to explore data independently and empowering them to command their understanding of the information. This aligns with Andy Kirk’s argument in his 2012 book, A Handbook for Data Driven Design, that datavis can perform distinct functions, such as explaining or inviting exploration. 

Some climate datavis, such as Warming Stripes, not only provided an explanatory picture of the data but also facilitated visual exploration, which significantly increased the participants’ emotional engagement. The majority of participants who responded to interactive datavis enjoyed the ability to independently search for data. It encouraged them to spend more time exploring the datavis and expanding their knowledge, not only about their close surroundings but also about other countries with which they had no personal ties. When pressed further on why the interactive Warming Stripes for Poland was so exciting for her, Natalia, based in Poland, commented while exploring the datavis: “And because they’re so interactive, it’s amazing, I love it. I’d just stay here and look at different countries. I’m gonna look at other ones…” This suggests that most people like datavis that invite exploration and are willing to devote more attention to them.

The appeal of visual form despite unsettling data

As discussed in this article, one of the key findings that emerged from my research was the powerful emotional impact of the design and visual style of datavis. Importantly, despite encountering frightening and distressing data related to climate change, many participants reported experiencing positive emotions such as joy, satisfaction, and a sense of playfulness right from the outset of their interaction with different datavis features explored earlier. This intriguing contradiction highlights the potential of the visual form to evoke emotions that seemingly transcend the seriousness of the represented data, in this case, climate change statistics. However, it is important to note that other factors matter as well, such as the individual’s interest in and orientation to the issue at hand, which contribute significantly to this emotional response.

This finding holds significant implications, particularly when considering the potential consequences of negative emotions caused by climate change or related data. Research by Charles Ogunbode and colleagues published in the journal Current Psychology in 2021 has shown that negative emotions linked to climate change can have adverse effects on people’s mental health, even leading to symptoms of insomnia. Therefore, the capacity of the visual form in datavis to elicit positive emotions becomes a noteworthy aspect in shaping individuals’ experiences with climate change data. In public discourse and media coverage, negative emotions tend to dominate discussions surrounding climate change, notes a 2013 study in the journal Global Environmental Change. Thus, the ability of datavis to evoke positive emotions like joy and satisfaction offers a fresh perspective on how we engage with challenging data.

The findings contribute substantially to ongoing discussions among datavis designers and scholars and prompt reflection on the consequences of using datavis to present challenging information. The findings suggest that datavis possesses the potential to transform unpalatable data, such as climate change statistics, into a more enjoyable experience for audiences. This raises an essential question about whether this is what datavis designers or experts want, and whether there are consequences. For example, would datavis mobilise people to act if they were all simply enjoyable? (I also attempted to delve into this question in my study, published in Significance, The Royal Statistical Society’s journal.)

However, further research in this area is warranted to fully understand the broader impact of datavis on emotional engagement and decision-making. By delving deeper into how datavis influences emotional responses and decision-making, we can gain valuable insights into how to best design datavis to effectively communicate complex and sensitive information.

The post Can Datavis Make Unpalatable Data More Enjoyable? appeared first on Nightingale.

Through a Partisan Lens: How Politics Overrides Information
https://nightingaledvs.com/political-psychology-in-data-viz/ (Fri, 17 Nov 2023)
A political psychology primer for information designers.

As information designers, we don’t typically think of our work as political. Our first loyalty is the data. We help viewers understand the world around them by wresting big, complex ideas out of the platonic ether and squeezing them into two or three dimensions. Normally, that just means solving the usual challenges like information architecture, dimensionality reduction, or weaving seemingly disparate facts into a cohesive narrative. 

But for some of the most important issues of our day, politics is a crucial lens through which people see the world, and this can impact how they see data. 

For example, consider an influential study from researchers at Yale, looking at how political alignment can create blind spots, even for the most analytically savvy people. Participants were presented with two different data stories: one on the efficacy of a skin cream for curing a rash, the other on the efficacy of gun control policies for stemming gun violence. The trick: Both stories were based on the exact same underlying data. So if participants read the data to say the skin cream was effective, they should rationally also conclude that the gun control policies were effective. But that’s not what happened. Instead, even for this highly numerate crowd, when participants saw the politically charged topic, their responses became polarized along party lines. Instead of objectively following the data, they couldn’t help but interpret it as evidence supporting their prior political positions. 
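The arithmetic behind the trick is worth spelling out: judging efficacy requires comparing improvement rates across groups, not raw counts. The sketch below uses illustrative counts (my own assumption, chosen in the spirit of the study so that the larger raw number of improvements sits in the treatment group) to show how the intuitive reading fails.

```python
# Illustrative 2x2 outcome counts for the skin-cream framing.
# These numbers are assumptions for demonstration, chosen so the
# treatment group has more "improved" cases in absolute terms while
# the control group improves at a higher RATE.

treated = {"improved": 223, "worsened": 75}
control = {"improved": 107, "worsened": 21}

def improvement_rate(group):
    """Share of a group that improved."""
    return group["improved"] / (group["improved"] + group["worsened"])

# Raw counts suggest the cream works (223 improved vs. 107) ...
treated_rate = improvement_rate(treated)  # about 0.75
control_rate = improvement_rate(control)  # about 0.84
# ... but the rates say the treatment group actually did WORSE.
effective = treated_rate > control_rate
```

Swapping the labels "skin cream" for "gun control" changes nothing in this computation, which is precisely why divergent conclusions on the two versions reveal motivated reasoning rather than a numeracy gap.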

To design effectively, it’s important to understand not just how to construct a clear chart, but how people will actually interpret it. Since politics can be so distorting, it’s worth understanding how it shapes our interpretations. To do this, we’ll unpack the social and political psychology that drive our attitudes and beliefs about big political issues. 

Why should data designers care about political partisanship?

Effective dataviz means designing for more than just the data on the page. The context that viewers bring to a visualization can shape how they respond to it. In our politically charged culture, the topics that need the most explaining are also often the most political. Whether we like it or not, the information that we present will be consumed through a partisan lens. By understanding these biases, we can at least address them consciously. This can help in a few ways:

  • Adapting to a fact-free universe. Information design is premised on information being helpful. But sometimes information can’t help. Understanding cases like these, when information isn’t actually useful because attitudes and prior beliefs cloud reality, can help us better pick our battles, prioritize our visualization efforts, and adapt our storytelling. 
  • Persuading people with people. When reasoning fails, people look to others for guidance. For political issues, we’re heavily influenced by the people around us. Understanding how attitudes can spread through dataviz can help us produce more persuasive visualizations. 
  • Minimizing the harmful side effects of well-intended dataviz. Information can do more than just inform. As we’ll see, partisan issue polling charts can increase political polarization. Understanding these unexpected risks can help us mitigate them.

In a partisan environment, if our ideas and decisions aren’t strictly based on information, where do they come from? To understand this we’ll dive into social and political psychology. 

Political information psychology

Understanding social and political psychology can help clarify the boundaries of information’s influence. As we’ve already suggested, the facts aren’t always as persuasive as they should be. 

On the other hand, some types of information can be influential in ways that it shouldn’t be.

Social Influences

Some city dwellers looking up, reenacting Stanley Milgram’s famous “Drawing Power of Crowds” experiment. Image made with Midjourney.

If you look up, I look up.

It’s almost a cliche to say that humans are social creatures, but that doesn’t make it less true. We are comically suggestible. For example, in a famous social psychology experiment from the 1960s, psychologist Stanley Milgram sent his research team out onto the busy streets of New York City. He instructed his team to find a crowded part of town, stop in the middle of the sidewalk, and just look straight up at the sky.

The busy New Yorkers not only noticed the researchers’ upward gaze, they stopped to join them. The passersby followed the researchers’ example, deciding to stop and find out what was so interesting. Other silly experiments show similar social conformity effects.

Why are MBAs conservative, and social scientists liberal?

Our social surroundings also influence our theories about how the world works, what we believe in, and what we value. For example, one 1996 study followed 91 students (34 business majors and 57 social science majors) throughout their college careers. The researchers wanted to understand how the students’ majors influenced their beliefs, particularly whether they thought poverty and unemployment were caused by personal failings (e.g. laziness) or larger systemic factors (e.g. inequality).

During their first year, students’ majors were uncorrelated with their beliefs. But by the third year, business school students disproportionately blamed poverty on the impoverished, while social science students pointed to external, systemic factors. The embedded cultural values of their coursework and their environments influenced their beliefs about this fundamental question of social justice.

Group Influences

An expressionistic interpretation of youthful tribalism, inspired by Henri Tajfel’s classic study. Image made with Midjourney.

Expressionism’s divisive influence on our impressionable youths

The silliness continues when considering the special privilege we give to people who are like us. The classic 1971 experiment highlighting tribalism showed how a group of adolescent boys, with common histories as classmates, were transformed into opposing factions when researchers assigned them to different taste groups based on their self-reported reactions to the works of Paul Klee or Wassily Kandinsky. Despite the boys’ shared history, when they were given a small pile of cash to divide amongst their classmates, suddenly their prior friendships meant very little.

Instead, the boys shifted their allocations dramatically toward their newfound brothers-in-art. This is, emphatically, not because the nuances of Kleesian vs Kandinskian expressionism were a hot topic for these high schoolers (behind the scenes, the researchers assigned the groups arbitrarily). Rather, it demonstrates how even the most arbitrarily constructed social groups can produce in-group favoritism and out-group discrimination. In fact, other experiments showed similar results when the groups were based on nothing more than a coin toss.

We like people who are like us, even if all we have in common is mutual disdain for some other group of people. 

Common ground beyond politics

These social group effects are presumably stronger for political groups, where party members actually have real things in common. Political psychology research suggests that we share some very primal psychological traits and needs with our fellow partisans. 

Political psychologists suggest that conservatives place great value on feelings of security and certainty (while liberals are comfortable with uncertainty, ambiguity, and risk). Conservatives also value uniformity in their social groups, while liberals value differentiating themselves. Perhaps because of these low-level psychological needs, members of today’s political parties have a lot in common with their fellow partisans (particularly U.S. Republicans, who also tend to share white, Christian, rural demographics).

This is the basis for the “identity stacking” theory of polarization. This theory observes that more and more of our identity traits have lined up with our political identity. For example, if you know that someone is a Democrat, then you’ve also got better odds at guessing their views on climate change, which parts of the country they live in, how long they spent in school, how confident they feel about the economy, and whether or not they’re armed.

If we have more and more in common with the people in our political party, then we’d expect our fellow partisans to be particularly influential.

Political Attitude Formation

An AI-generated image of cats on the right wearing red and dogs on the left wearing blue. Between them is an image of pug in front of the Canadian flag. The group of dogs have hearts above their heads while the cats have frowny face icons above theirs.
Do different parties have different attitudes on Canadian imports? Image made with Midjourney.

One thing we all have in common: We’re busy. And we’re tired. (So so tired.) Even if we have the interest, very few people have the time or energy to dive into the guts of tax policies, environmental regulations, or the extended implications of Citizens United. These are big, complex and multifaceted policies.

So, for very practical reasons, people form their attitudes and judgements by listening to other people that they trust. In particular, we look to our political parties to tell us which policies we should support and which ones we should oppose.

Do we choose our parties based on policies, or our policies based on parties?

One interesting study from 2003 told participants about one of two proposed welfare programs, either a severely “stringent” program that offers far less support than existing policies, or a “generous” program that’s almost shockingly extensive compared to any U.S. welfare programs to date. 

From an ideological perspective, you’d expect conservatives to favor the former and dislike the latter, and liberals the opposite. However, researchers found that the content of the policy itself didn’t matter nearly as much as who endorsed it. For example, conservatives were willing to support either program as long as they were told it was supported by “95% of Republicans and 10% of Democrats.” 

Instead of choosing political parties that match our ideas, the process seemingly happens in reverse. We’re flexible on our policies as long as they’re supported by our people. 

How can attitudes spread through dataviz?

As we’ve seen, our attitudes are influenced by the people around us. This is especially true for political judgements that are difficult to learn experientially. It turns out that this same influence can happen through charts. For example, public opinion polling is a popular topic in political data journalism. What influence might we expect from charts like these?

A chart showing 67% of U.S. adults believe it should be illegal to manufacture or distribute camouflage-pattern Crocs.
This chart shows 67% of Americans oppose camo-crocs.
This is fake data, but seems like a reasonable guess?

Consider the chart above. This shows pretend results from a hypothetical public opinion poll seeking Americans’ views on camo-Crocs. Specifically, it shows that 67% of people would support a policy to ban these abominations of footwear. Since this shows that the policy is generally popular, we might expect viewers who see this chart to identify with their fellow citizens and adjust their own attitudes to match the social norm shown in the chart. 

  • For people who were previously opposed to the policy, social psychology suggests that they’d increase their support. 
  • On the other hand, people who were already very strong supporters might actually decrease their support, since they see that others are comparatively ambivalent.

This highlights an important concept: By showing that an idea is popular, charts can make the idea more popular. And vice versa. 

This chart shows that 84% of Democrats and 45% of Republicans support a camo-Crocs ban.
This chart shows that Democrats strongly support and Republicans slightly oppose a camo-Crocs ban.
This is totally fake data. Republicans surely also agree that camo-Crocs are ridiculous.

This chart shakes things up a bit. Now it shows the results from our hypothetical opinion poll split by political party. We can see the camo-Crocs ban is very popular with Democrats and less popular with Republicans. These are effectively party endorsements; they’re just quantified and visualized. In the last section, we covered several experiments where highlighting a party’s endorsement of a policy changed viewers’ attitudes toward the policy, so we’d expect charts like these to have similar effects.

  • If a moderate Democrat sees this chart, we’d expect them to increase their support. 
  • If a moderate Republican sees the chart, we’d expect them to decrease their support. 
  • If a bunch of moderate Democrats and Republicans all see this chart, we’d expect their attitudes to diverge away from each other. 

This example shows one of the potential consequences of social conformity from polling results. For partisan-split polling charts like these, we might expect peoples’ attitudes to become more polarized. To the extent that polarization is bad, it implies that charts like these have an inherent social cost. These charts may be informationally valuable — or at least mildly entertaining — but they’re not without risk.  
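As a rough illustration (a toy model, not the model from the research discussed here), the polarizing dynamic can be sketched as each moderate viewer shifting partway toward the poll result displayed for their own party, which widens the gap between the parties. The attitudes, party norms, and update weight below are all assumed values.

```python
# Toy model of conformity to a partisan-split poll chart: each viewer
# moves a fraction of the way toward the norm shown for THEIR party.
# All numbers (attitudes, norms, weight) are assumptions for illustration.

def update(attitude, party_norm, weight=0.3):
    """Move an attitude partway toward the displayed party norm."""
    return attitude + weight * (party_norm - attitude)

# Moderates from both parties, on a 0-100 support scale.
dem_moderates = [55, 60, 52]
rep_moderates = [45, 40, 48]

# The chart displays polarized party norms (e.g. 84 vs. 45).
dem_after = [update(a, 84) for a in dem_moderates]
rep_after = [update(a, 45) for a in rep_moderates]

def mean(xs):
    return sum(xs) / len(xs)

gap_before = mean(dem_moderates) - mean(rep_moderates)
gap_after = mean(dem_after) - mean(rep_after)
# The between-party gap widens: viewing the chart polarizes the viewers.
```

Even with everyone starting as a near-centrist, the displayed norms pull the two groups apart; the size of the effect in this sketch depends entirely on the assumed conformity weight.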

Our research shows that both of these effects are very real. Political polling charts can very much influence viewers’ political attitudes. When viewers see a chart showing that a policy is popular, that chart can make the policy more popular. When viewers see a chart showing that attitudes are polarized across party lines, that chart can make viewers more polarized.

Great, so what? What should designers do differently?

Alberto Cairo offers a useful maxim for ethical data journalism: “The purpose of journalism is to increase knowledge among the public while minimizing the side effects that making that knowledge available might have.” He summarizes the goal as: “Increasing understanding while minimizing harm.”

As we’ve seen, attitudes can spread from person to person, regardless of their actual content. This means that visualizing attitudes from survey results can have the unexpected side effect of promoting those attitudes. This can be risky in the context of political polarization, as visualizing polarized attitudes can increase polarization.

The social conformity effect can also be harmful in and of itself. 

For example, imagine an interest group called “Dirty Handed Doctors of America.” Let’s say they survey their unhygienic-but-medically-credentialed members. Their main finding: “94% of MDs in our esteemed organization strongly agree we should stop washing our hands before treating patients.” That finding may in fact be totally accurate. Their opinion is wrong, but it could be true that 94% of them support it. Our research suggests that visualizing extreme attitudes like these might help them spread further (like the germs on their filthy, filthy hands). So even though their survey results might be technically true, publicizing them may reduce support for hand-washing among other sympathetic physicians.

This means that we can’t just assume, by default, that visualizing polling results is a civic good, simply because the results are accurate and informative. As Cairo suggests, we have a stronger duty-of-care than simply conveying technically accurate information. Since visualizing attitudes comes with an implied risk, we need to consciously weigh those risks versus whatever benefits we expect from publicizing them.

What should we do?

Before publishing polling results, especially for political issues, designers, data journalists, and editors should ask: “If more people agreed with the attitudes in the chart, would that be good for the world?”

To be sure, this won’t always be an easy question to answer. Attitudes supporting “physicians shouldn’t wash their hands” are obviously silly. But political topics typically cover grayer areas, involving subjective values and morals. Deciding to publish political attitudes, then, is a subjective judgment call. The question above can only help you frame that decision. By at least attempting to answer the question, you’re forced to weigh the risks of spreading potentially silly ideas versus the benefits of sharing the information. Even if you decide that the information is worth the risk, you’ve at least made the judgment call consciously and thoughtfully, rather than taking its value for granted.

Takeaways

Viewers’ politics can influence how they see the world. This, in turn, influences how they take in new information. This has a few important implications for anyone visualizing social issues or otherwise politically-charged information.

  • Facts aren’t always as influential as they should be. If all of our attitudes and decisions were purely rational and information-based, the silly effects we highlight above wouldn’t exist. But in the real world, judgments about identical datasets can flip based on a person’s politics. And attitudes toward public policies are more influenced by endorsements than the policies themselves. Information is still influential, but the surrounding social context should be considered as well.
  • Survey results can be influential in ways they shouldn’t be. Information about others’ political attitudes (e.g. polling results) can unreasonably influence our own political attitudes. This influence can happen through simple partisan cues, like whether or not a party supports or opposes a policy, or by visualizing survey results. This also means that popular political data-journalism, such as election forecasts or issue polling, can have some toxic side effects like increased political polarization.  
  • Before visualizing political attitudes, weigh the risks versus benefits. Information designers should take these risks of attitude contagion into account when deciding whether to visualize and how to frame polling results. We can’t always objectively answer the guiding question (“If more people agreed with the attitudes in the chart, would that be good for the world?”) but by raising the question in the first place we can ensure judgment calls like these are made consciously and thoughtfully.

Dive deeper!

This writeup is meant as a primer for 3iap’s latest peer-reviewed visualization research, which we presented at this year’s IEEE VIS conference, in collaboration with Georgia Tech’s Cindy Xiong-Bearfield. If you’d like to better understand the pathway from polling charts to polarization — or see our 9 minute talk on the politics of cats and dogs — please check out our deep dive on the research project.

Dive Deeper: Polarizing Political Polls Design Research Project.

Step 7 in the Data Exploration Journey: Spin-Off Projects
https://nightingaledvs.com/data-exploration-spin-off-projects/ (Tue, 24 Oct 2023)
With large projects, it's common to pursue spin-off ideas for the material that doesn't fit into the core project. Here are two examples.

This article is part 8 in a series on data exploration, and the common struggles that we all face when trying to learn something new. A list of previous entries can be found at the end of the article. I began this series while serving as the Director of Education for the Data Visualization Society, because so many people were asking to hear more about the process of data exploration and analysis. What began as an exploratory project on the “State of the Industry Survey” data grew into a 1.5-year Career Portraits project that produced the 2023 “Career Paths in Data Visualization” report (DVS member login required). This series illustrates how I approach a new project, and the fact that no “expert” is immune from the challenges and setbacks of learning. Let’s see where this journey takes us!

In the last article, Jenn Schilling and I refocused my initial data exploration to frame a broader Career Portraits project based on the DVS “State of the Industry Survey” data. We trimmed our scope aggressively to reflect the time and resources that we had on hand, and re-envisioned some core parts of the project. At the end of that focus-and-consolidate phase, we had a clear, tight focus for the project. 

Successfully navigating the focus phase has many advantages. First, it allows you to put your energy into the most important things. It also creates the seeds for lots of new ideas and projects that can spin off of the core work; very often, almost everything you cut can be considered a future upgrade or a new project in its own right. 

If your time and resources are fixed, focusing a project can mean postponing parts that you’re excited about. (I suspect that this is why most people struggle to make the cuts.) In many cases, this also creates an opportunity to share the work or to structure it in new ways. The end of the focus phase is the perfect time to start looking for collaborations that can help to move your project ahead. The guidelines from our previous article on collaboration still apply! In the case of spin-off projects, it’s particularly important to remember that a collaboration adds back to the scope that you just reduced. You need to account for that effort, and should never use spin-offs simply to avoid making cuts.

When working on initiative-level collaboration for something as large as the Career Portraits, it’s important to: 

  • Set clear boundaries between projects. Everyone needs to know what they’re working on, who’s doing what, how it’s different from what others are doing, and what’s needed for it all to come together. Mixed messages lead to missed goals, duplicated work and frustration.
  • Look for common goals and mutual wins. The best collaborators have an interest that is slightly outside of your scope, but whose needs align with yours. We’d worked on aspects of both projects below during our initial Career Portraits work, but pursuing them as collaborations generated significant contributions that supported and extended the core work beyond what we could have done alone. 
  • Work to align timelines in advance. In some cases, collaboration like this creates dependencies. It’s important to be clear about when you need things done, and to be willing to flex if the schedule doesn’t work out as you hoped. 
  • Be realistic about what you can take on. Collaborations take a lot of work and require support to succeed. Collaboration is not delegation, and you need to be available to fully participate in any project that you spin off. If you can’t realistically support it from start to finish, don’t start.  

As Jenn and I started the heads-down build phase for the core Career Portraits work, I was able to identify and spin off two projects in collaboration with other teams: 

Collaboration #1: Career Interviews Series 

The first project that we spun off was a series of career interviews with people working in data viz. I knew from the beginning that I wanted to include qualitative stories alongside the quantitative data for the Career Portraits, to give the data a more human face and to illustrate how much variation there is within even a few of the individual “data points” (a.k.a. responses) from the survey. It’s always important to check your insights against reality whenever you are building a data story, and connecting with people from the community was one way to help us do that. While Jenn began re-working the core data analysis in December 2021, I started a series of research interviews with people working in the field. I put out a call for participants in the newsletter, on Slack and in a couple of articles, and we got a core set of interviews scheduled in January to start the research. 

YouTube video cover with a title and photo of Elijah Meeks
Caption: Cover page for the Careers in Data Visualization interview series, hosted by Elijah Meeks

Around March of 2022, Amanda Makulec started conversations with Elijah Meeks about hosting a series of career conversations to spotlight paths into data viz. This aligned well with the work that I was already doing for the Career Portraits project, so we joined efforts and I worked with Elijah to brainstorm some questions and visualizations to inform his series. Amanda organized the calls to be released over the summer, and Josephine Dru and I compiled transcripts for each one as they came out. My early research calls had given us a good sense of where we wanted the project to go, so I was also able to compile a pre-survey for all participants to take. The questions were similar to the ones we were working on in the Career Portraits project, but they went into more detail and depth on a few points that we wanted to learn more about. Because we were working in a short format with a patient and supportive audience, we were able to ask much more focused (and sometimes repetitive) questions than we would publish in the general survey. As the results came in, I visualized the data and wrote a series of summary profiles for each interview, creating the “Career Profiles” section of the final report.

Adding profile interviews to the newly-reduced scope of the “Career Paths in Data Visualization” report more than doubled the work and required pushing our initial deadline back by a couple of months, but it gave the final project a much richer view into career paths in data viz. If we hadn’t cut deeply during the focus phase for the core project, we wouldn’t have had the time budget to take advantage of this opportunity when it arose. Collaborating on the profile interviews made both projects stronger, and made the lift much smaller than if the Career Portraits team had tried to do it all alone. 

Collaboration #2: Automated Tagging of Free-Text Job Titles

The second project we spun off was much larger in scope. Jenn had completed a quick clustering analysis when she first joined the Career Portraits project to look for trends in the kinds of tools needed across different careers. By looking at where specific titles intersected or crossed over between career areas, we thought we’d be able to pull out a lot more detail about specific roles. Our early results were very intriguing, but we quickly realized that this analysis was a project all on its own, and it wasn’t realistic to pursue it as part of the core Career Portraits work. Quite reluctantly, we put the clustering work aside so that Jenn and I could focus on re-running her initial core careers analysis on the updated 2021 dataset, ensuring that we were using data from the most recent survey collection year. 

Graph diagram showing 7 color-coded clusters with connected nodes.
Clustering analysis from Lukas’ thesis project

In April of 2022, a graduate student named Lukas Geisseler posted a query in the DVS Slack looking for organizational partners for his master’s thesis in Applied Information and Data Science at Lucerne University in Switzerland. His program required a project with a well-defined topic that would contribute to an organization. I reached out in response, and we started discussing whether he might be able to use the survey dataset to support his thesis project. The clustering analysis was where our discussions began, but we soon settled on a much larger task that would be a fantastic addition to our dataset and a better candidate for the depth and scope required of a master’s thesis. To understand the importance of his contribution, it’s helpful to know a little bit more about the limitations of the survey dataset.

There are two columns related to careers in the survey dataset. The first one is a fixed career category that users select from a dropdown (analyst, designer, etc.), and the second is a free-text entry field for job title, where people input their actual job title. When we first started the Career Portraits project, I manually (and inconsistently) tagged a couple thousand survey responses to compare the fixed bins to the job titles, and found many interesting threads to pursue. My exploration was quick and dirty, but it gave me a better sense of the dataset, and it pointed out some important aspects of the data that helped us to contextualize the results for our core Career Portraits work. 

In the fixed career category question, people often categorize themselves differently than you’d expect based on their job title: someone categorizes themselves as an analyst in the fixed careers list but their title is data visualization developer, or an engineer has the title of data visualization UI designer. This in itself is a fascinating statement about job searching and careers in general, but it’s particularly relevant in a career where roles tend to be highly variable between companies, and are often conglomerates of multiple roles and responsibilities. 

The definition of the fixed categories themselves also posed some challenges. First, it wasn’t clear when someone should switch from calling themselves an “analyst” to “leadership”: if your title is director of analysis, where do you put yourself? Some people listed themselves as “leadership” when they got to a team lead or director level position, and some were still listing themselves as “analysts” when their title stated VP or CEO. Second, it could be hard to understand the definition of some of the buckets: Do data scientists belong in the analyst, developer, or scientist bucket? Representatives show up in all three! We knew that these response variations would muddy our results (for instance, including a VP’s salary will almost certainly skew the median salary for a career), but there wasn’t a good way for us to consistently and efficiently tag the titles with the resources and the project team that we had. Using the free-text question could have helped to clarify some of these more complex cases, but we reluctantly chose to rely only on the fixed buckets for our project because they were the simplest to use and the most direct reflection of the user’s input.

The free-text job title question also contains a lot of implicit information about job seniority, role progression, and experience. It would be very interesting to compare job seniority level by title (junior analyst, senior analyst, director of analysis, etc.) with years of experience in the base survey dataset. For example, it would be interesting to compare the range of years of professional experience typical for a junior vs a senior role. Unfortunately, we couldn’t easily extract the job level information from the free-text entries without more advanced methods. In order to focus on our core deliverable, we made several painful cuts and put these more nuanced analyses aside, hoping to come back to them another day. 
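To make the challenge concrete, here is a hypothetical, minimal rule-based seniority tagger, in the spirit of the quick manual pass described above. The keyword lists and level names are my own invention, not the survey's schema. Rules like these break down quickly on real-world titles, which is exactly why more advanced methods were needed:

```python
import re

# Hypothetical keyword rules, checked in priority order. These are
# illustrative assumptions, not the actual survey categories.
SENIORITY_RULES = [
    ("executive", r"\b(vp|vice president|ceo|cto|chief)\b"),
    ("director",  r"\b(director|head of)\b"),
    ("lead",      r"\b(lead|principal|staff)\b"),
    ("senior",    r"\bsenior\b|\bsr\.?\b"),
    ("junior",    r"\bjunior\b|\bjr\.?\b|\bintern\b"),
]

def tag_seniority(title: str) -> str:
    """Return the first seniority level whose keywords match the title."""
    t = title.lower()
    for level, pattern in SENIORITY_RULES:
        if re.search(pattern, t):
            return level
    return "unspecified"

print(tag_seniority("Senior Data Visualization Analyst"))  # senior
print(tag_seniority("Director of Analysis"))               # director
```

Even this toy version shows the ambiguity: a "Director of Analysis" matches both an analyst role and a leadership level, and most titles fall through to "unspecified."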

We didn’t know it at the time, but we didn’t have long to wait. Auto-tagging free text responses was exactly the kind of problem that Lukas was interested in studying, and he developed a pipeline to analyze the job titles as part of his thesis project. The full pipeline includes creation of neural networks, machine learning to train the algorithm, and graph representation to help interpret and quality-check the dataset.

Once built, this pipeline automated the analysis of the job titles data, removing a tedious and manual task that is hard to do consistently over large datasets. However, the initial process of training the algorithm required that Lukas manually validate his machine learning tags. He published an early version of his results in the DVS Slack with a call for participation, and some dedicated community members helped him to error-check and validate the initial coding results, lightening the load on one of the more time-consuming parts of his thesis work. This validation process also helped Lukas to fine-tune his approach, and made the final outputs more robust. Once the initial text tagging was validated, he carried out a clustering analysis to look at how jobs were distributed within the results, and used the resulting graph to detect patterns and intriguing details in the dataset.
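Lukas' actual pipeline involved neural networks and graph representations, which are beyond the scope of this article. As a deliberately simplified sketch of the general clustering idea, TF-IDF vectors plus k-means can group free-text titles by shared vocabulary (the titles and cluster count below are invented):

```python
# Simplified illustration only, not Lukas' method: vectorize job titles
# with TF-IDF, then cluster them with k-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

titles = [
    "data analyst", "senior data analyst", "business analyst",
    "data visualization engineer", "front end engineer", "software engineer",
    "ux designer", "product designer", "visual designer",
]

X = TfidfVectorizer().fit_transform(titles)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

for title, label in zip(titles, labels):
    print(label, title)
```

Real survey data is far messier than this toy list, which is where the validation loop with community members earned its keep.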

Lukas’ thesis was submitted in June of 2023. In all, he contributed more than 500 hours of advanced data science and programming time to the DVS. This is far more than we’d expect from a typical volunteer, but he was able to benefit from our dataset, experience and use case as a core part of his thesis, and we certainly benefited from the results of this in-depth academic project. Lukas has already tagged the 2020 and 2021 datasets, and the outputs of his model can be used to tag future datasets as they are collected. Leveraging the algorithm’s consistency and speed will also allow us to enrich datasets from previous years. This enrichment might support more advanced longitudinal analyses in future, if we can overcome the complexities of changing definitions and survey context from year to year. Lukas’ analysis also created a more robust version of the initial clustering that Jenn had worked on in the core project. Producing similar results with a completely different method added confidence to some trends, and highlighted some potential differences or artifacts in others. Comparing the initial results from the two analyses helped us to identify many interesting stories within the dataset, and to outline potential next steps to pursue. 

What makes a spin-off project work?

There were several ingredients working together that made these spin-off projects successful. Here are some things to look for when evaluating a side project:

Opportunities to provide value to both sides

For Elijah’s interview series, we were able to help brainstorm questions and prep materials, and we folded the results of the interviews back into our work to give them additional impact. When the portraits were published, the interviewees got a document that they can share to highlight their work. We got support from Amanda in scheduling and running the calls, the benefit of Elijah’s standing in the community and his perspective on what’s interesting to talk about, additional material to support the Career Portraits effort, and a collection of generous insights from the people that he interviewed. 

In Lukas’ thesis project, we were able to provide an interesting dataset and a tangible problem, as well as some basic explorations to accelerate the start of his work. It’s nice to work on a thesis project that will continue to be valuable after the work itself is complete, and a tangible outcome can help to make an academic thesis more approachable to potential employers. We got almost a year of focused, highly specialized work that enriched our dataset and will help us to continue creating value for the DVS community. 

Clear separation between projects

While both projects contributed directly to the Career Portraits work, neither was required for it to succeed. We wanted the projects to feed and encourage one another, but not to introduce unnecessary pressure or risks. The projects were structured in a way that allowed others to take ownership and carry an initiative forward without a lot of input from the core Career Portraits team, but we also set up regular communication between the projects to learn from one another, seek additional opportunities for alignment between projects, and to highlight the contribution and impact of each team. 

A critical aspect for planning was to remove or reduce timeline dependencies between projects, so that if one project fell behind or changed direction it wouldn’t break the others. We did need to complete the interview series before the report could be published, but the profiles were written and visualized independently from the analysis work that I was doing with Jenn. I handed off a completed document at the end of my board tenure in January of 2023, ensuring that the core project reached completion before it changed hands. We didn’t include Lukas’ results in the core document because we knew that his project wouldn’t finish until at least six months later, but his results will help to extend and inform the clustering analyses that we’d started before he joined. Having seen his preliminary results, his work could even become the seed for Career Portraits V2, if the DVS decides to pursue that project in future.

Honor each contribution 

Each collaboration is a significant commitment of effort and time. It’s important to honor each person’s contribution to the bigger effort, and to take the time to make sure that work is rewarded. There is a difference between collaboration (where both sides contribute) and delegation (where one side assigns work to someone else and expects an outcome). I find that people often confuse the two. To be a good collaborator is to commit to doing work to raise your collaborators up, even if it’s outside of the focus of your core project. For this reason, you need to be very careful about assessing your own ability to support the work involved in a side collaboration. The focus stage of a project is necessary to evaluate whether you have the time budget to do that successfully. 

In the profile interview series, our commitment took the shape of additional preparation for the interviews and nearly doubling the scope of the Career Portraits deliverable. For the thesis work, I chose to continue working with Lukas past my tenure as DVS Education Director, to make sure that he had the support he needed to see his project through to the end of his thesis work. Collaboration is a giving economy, and it’s important that you commit to your collaborators as deeply as you ask them to contribute to you. 

In the end, both of these collaborations were highly successful, and I believe that they created significant value to the organization. Both leveraged the early groundwork that Jenn and I had done, but each project took things to a completely different level and contributed far beyond what we were able to do on our own. Because we were disciplined about the focus phase of our project, we were able to identify and act on these opportunities when they came up, allowing us to collaborate in ways that we couldn’t have anticipated at the time we made the cuts. 

You won’t always be able to spin projects off immediately with these kinds of results, but a disciplined, focused approach to project management helps to ensure that you’ll be ready to jump on opportunities when they come. Returning later to the things you cut means that you’ll always have another project ready if your initial inspiration runs dry, or your project hits a wall. 

The core Career Portraits project was published this summer in the DVS member space (member login required). We’ll continue discussing the actual project build in the next article!

Previous articles in this series:

Embrace the Challenge to Beat Imposter Syndrome
Step 1 in the Data Exploration Journey: Getting to Know Your Data
Step 2 in the Data Exploration Journey: Going Deeper into the Analysis
Step 3 in the Data Exploration Journey: Productive Tangents
Step 4 in the Data Exploration Journey: Knowing When to Stop
Step 5 in the Data Exploration Journey: Collaborate to Accelerate 
Step 6 in the Data Exploration Journey: Cut to Realistic Scope

Related links:

Early Sketches for Career Portraits in Data Visualization, by Jenn Schilling
DVS Careers in Data Visualization, YouTube Playlist for interview series by Amanda Makulec and Elijah Meeks
Career Portraits project (DVS Member space login required)

Categories: How To

The post Step 7 in the Data Exploration Journey: Spin-Off Projects appeared first on Nightingale.

Making Dashboards Optimal for Human Brain Processing
https://nightingaledvs.com/dashboards-human-brain-processing/ | Thu, 21 Sep 2023
How the science behind brain processing and active memory can help guide our dashboard and other data visual designs.

Have you ever spent days poring over charts and diagrams only to feel no closer to understanding the problem? You’re not alone. Consider the findings from a recent Oracle study titled “How Data Overload Creates Decision Distress,” which surveyed 14,000 people, including employees and business leaders across 17 countries. A staggering 70% of respondents admitted to giving up on decisions because of overwhelming data. The report also underscored the critical importance of decision intelligence for business leaders. A resounding 93% of business leaders believed that having the right decision intelligence can make or break an organization’s success.

Image with text summarizing the study’s findings:
  • 74% of people say the number of decisions they make every day has increased 10x over the past three years.
  • 97% of people want help from data, but 86% say the volume of data is making decisions in their personal and professional lives much more complicated.
  • 85% of people say the inability to make decisions is having a negative impact on their quality of life.
  • 78% of people believe they are getting bombarded with more data from more sources than ever before.
  • 70% of people admit they have given up on making a decision because the data was too overwhelming.
Summary of relevant results of the 2023 Oracle Study “How Data Overload Creates Decision Distress.”

So, can we make the data the guiding light it has to be without shining too brightly? Let’s closely analyze how data is transformed into visuals, which is where I believe the real challenge arises. Unlike earlier automated stages, this step depends on human interpretation to turn pixels into actionable knowledge. It’s where noise peaks and misinterpretation risks are highest, from superfluous graphics and irrelevant metrics to data integrity issues. We also need to keep in mind that from the perspective of information theory, which studies how to transmit signals efficiently, a dashboard is a communication channel.

Channels, which in our case means dashboards, have cognitive limits, however. This limit is believed to be around 120 bits per second, the brain’s maximum speed of conscious information processing. Although our brain can take in up to 11 million bits per second, we can only consciously process about 120 of them by design. The efficiency of our brains allows us to filter and compress millions of data points down to the fraction of most critical information, which we can process immediately and consciously.

Additionally, our attention span is measured in seconds, and with the demands of multitasking, meetings, random work chat messages, and calls, the time available to process incoming signals from a digital canvas is decreasing. While our attention span used to be about 21 seconds, now it is closer to eight seconds.

As we process the incoming signals, spending our precious seconds and conscious effort, we can hold only 5-7 information items simultaneously in active memory. So even when there is time and appropriate attention to process the signals from data, these signals must come in small doses. 
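The numbers above allow a quick back-of-envelope check. Assuming roughly 100 bits for a small data visual (a figure this article cites for conscious processing speeds), an eight-second attention window buys only a handful of visuals:

```python
# Back-of-envelope arithmetic using the figures cited in this article.
# The "bits per small visual" value is a rough assumption.
CONSCIOUS_BITS_PER_SECOND = 120
ATTENTION_WINDOW_SECONDS = 8
BITS_PER_SMALL_VISUAL = 100

budget_bits = CONSCIOUS_BITS_PER_SECOND * ATTENTION_WINDOW_SECONDS  # 960 bits
visuals_per_glance = budget_bits // BITS_PER_SMALL_VISUAL           # 9 visuals

print(budget_bits, visuals_per_glance)  # 960 9
```

Even this generous estimate of about nine small visuals exceeds the five to seven items active memory can hold, so working memory, not raw attention time, is often the binding constraint.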

Image illustrating the speed of consciously processing information in bits per second, with each 20-25 bits depicted as one grain:
  • Interpreting familiar visual cues such as facial expressions: 20 bits (1 grain)
  • Listening to one person speaking: 50 bits (2 grains)
  • Two people speaking: 100 bits (4 grains)
  • A small data visual: 100 bits (4 grains)
Visualized by the author based on information in the Fast Company article “Why It’s So Hard to Pay Attention, Explained by Science”

Additionally, noise is amplified when there is a significant mismatch in domain expertise between the sender and the user. Managers are typically subject matter experts while analysts are not. If this mismatch is too great, the dashboard can become inherently noisy, rendering valuable information trapped and ineffective. 

Furthermore, users often need to build their data literacy to navigate dashboards effectively, including using filters and interactive features. 

Turning data into effective signals

Dashboard-ready analytics and metrics

Image displaying simplified formulas for relative metrics, categorized by department and industry:
By department:
  • Marketing: Acquisition cost = Marketing spend / Customers gained
  • Finance: Profit margin = Profit / Revenue
  • Sales: Net sales % = Net sales / Sales
  • Production: % Equipment utilization = Uptime hours / Total available time
By industry:
  • FMCG (trade marketing): Volume on deal % = Promo sales / Total sales
  • Oil and gas (valuation): Reserve to production = Reserves / Production
  • Healthcare (operations): Occupancy rate = Number of beds occupied / Total number of beds
  • Banking (risk): Risk limit utilization = Current exposure / Risk limit

First,  we need to maximize the informational value per visual. To do that, metrics must have three qualities:

1. Be relative or an index compared to a benchmark
2. Be filtered to the dashboard use case 
3. Be set against a comparable benchmark (e.g., a plan, budget, or limit). 

Be relative or an index compared to a benchmark

Our understanding of crafting relative metrics for data-driven decisions has advanced significantly in recent decades. Techniques like the balanced scorecard, unit economics, ratios, and contribution analysis share a common thread: they all reveal how one metric relates to another. Consequently, these metrics offer threefold or more informational value for the same user attention time. For instance, a visual showing margin percent proves more insightful to a decision-maker than standalone profit figures. Across various industries, whether ratios, margins, or unit economics, a common pattern emerges—a numerator and a denominator. As these index KPIs evolve, they accumulate even richer insights by tapping into the values of underlying KPIs and their connections to benchmarks.
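As a minimal sketch of the numerator/denominator pattern described above (the figures are invented for illustration):

```python
# A generic "index" KPI is just a guarded numerator/denominator computation.
def ratio_metric(numerator: float, denominator: float) -> float:
    """Generic relative KPI, e.g. profit margin or occupancy rate."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator

profit, revenue = 120_000, 800_000
margin = ratio_metric(profit, revenue)
print(f"Profit margin: {margin:.1%}")  # Profit margin: 15.0%
```

The same two-line computation covers every formula in the table above; only the choice of numerator and denominator changes by department or industry.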

Be filtered to the dashboard use case

It is not enough to visualize just one base formula. It is better when a range of derivative measures is present for selection; this allows users to choose different aggregation levels and periods without interrupting the analysis flow. Five to 10 derivative formulas should support a single metric. Here, analytics becomes an instrument that maximizes user experience.

Be set against a comparable benchmark

Expected KPI values are essential to the informational value of the whole visual. They provide context even to a person unfamiliar with the subject matter. Benchmarks, plans, or limits usually come from manual analysis done by the managers. In these simple figures, a wealth of research is hidden. A good dashboard must use this wealth to enrich all data with a straightforward method: comparing current values against planned ones.   
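A hedged sketch of what setting a metric against its benchmark might look like; the function shape and values are my own illustration, not a standard API:

```python
# Enrich a metric with its benchmark: actual vs. plan, expressed as both
# an absolute variance and an attainment index. Values are illustrative.
def vs_plan(actual: float, plan: float) -> dict:
    return {
        "actual": actual,
        "plan": plan,
        "variance": actual - plan,
        "attainment": actual / plan,  # > 1.0 means ahead of plan
    }

kpi = vs_plan(actual=0.15, plan=0.18)  # a 15% margin against an 18% plan
print(f"{kpi['attainment']:.0%} of plan, {kpi['variance']:+.1%} variance")
```

A single attainment figure like "83% of plan" carries the manager's planning research inside it, which is exactly the enrichment described above.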

Resulting in last-mile analytics

I call metrics that have these three characteristics last-mile metrics. These metrics are delivered and handed to the decision-maker the same way a postal package is handed to the recipient on the last mile of delivery. Akin to the product’s journey from the warehouse shelf to the back of the truck to the customer’s doorstep, a data insight travels from the source database to the data warehouse and finally to the dashboard. This last leg of the delivery process is the most critical, both in the supply chain and in data analytics.   

When metrics have these three qualities, the last-mile handover is likely to be successful. 

Data modeling for best filtering and drill-downs

Power BI dashboard image displaying various data visualizations and filters: At the top, there are filter options for segment, priority, ship mode, category, and date. In the main section: A white section with 'Sales' in a larger font at the left margin. Adjacent to 'Sales,' there are the following secondary metrics: Profit (in absolutes) Quantity Average Price Average Check These secondary metrics are presented with bar graphs or line graphs. Another bar diagram shows the distribution of revenue % between three segments. A heatmap table on the left displays the distribution of profit by ship mode, and on the top, it shows profit distribution by segment. The table uses darker background colors to represent higher values, creating a heatmap effect. Another distribution table shows profits per region and the three segments, also with a heatmap color scheme. A final table on the right displays category, product name, sales, sales by month (as sparklines), profit, profit by month, quantity, average price, and average check
Example of dashboard made by the author for a speaking event based on Global Superstore dataset, available on Kaggle.

Second, dimension filtering options allow the user to receive more signals.  For this, a data model must be in place. Data models dictate the table structures and their relationships. The most valuable data model schema, in my experience, is the star schema. The star schema uses a central fact table surrounded and supported by a range of dimension tables. Usually, each business process should have its own star. As complexity increases, stars can become constellations.

This approach takes more time to prepare and model than just visualizing a pre-filtered queried table. Still, because it gives the user greater control of how and when to drill down, it alleviates data overload significantly.  With a digital canvas of visuals built on a star schema model, users can slice and dice the same absolute and relationship metrics across all filters.
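As a toy illustration of the star-schema idea (table and column names invented to echo the sales dashboard above), pandas can join dimension tables onto a central fact table and then slice along any dimension:

```python
# A tiny star schema: one fact table plus two dimension tables.
import pandas as pd

fact_sales = pd.DataFrame({
    "product_id": [1, 1, 2, 3],
    "region_id":  [10, 20, 10, 20],
    "sales":      [100.0, 150.0, 80.0, 120.0],
    "profit":     [20.0, 30.0, 8.0, 36.0],
})
dim_product = pd.DataFrame({"product_id": [1, 2, 3],
                            "category": ["Furniture", "Office", "Tech"]})
dim_region = pd.DataFrame({"region_id": [10, 20],
                           "region": ["West", "East"]})

# Join the dimensions onto the fact table, then slice and dice freely.
star = (fact_sales
        .merge(dim_product, on="product_id")
        .merge(dim_region, on="region_id"))
by_region = star.groupby("region")[["sales", "profit"]].sum()
print(by_region)
```

The same `star` frame answers profit-by-category, sales-by-region, or any cross-cut without re-querying, which is what lets a dashboard offer filters on every dimension at once.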

As the visuals change, their movement captures the brain’s attention without conscious effort. This changing canvas also allows us to mitigate the cognitive load problem of showing too many visuals simultaneously. The user can interact with the channel and request more signals once ready (by choosing filters or pressing buttons). With each interaction, as the canvas becomes a familiar setting of graphs and charts, the variety and volume of visuals can increase gradually without causing data overload.

Power BI dashboard gif image displaying changing data as various data visualizations and filters are pressed: At the top, there are filter options for segment, priority, ship mode, category, and date. These are chosen and the numbers and graphs change in the main section In the main section: A white section with 'Sales' in a larger font at the left margin. Adjacent to 'Sales,' there are the following secondary metrics: Profit (in absolutes) Quantity Average Price Average Check These secondary metrics are presented with bar graphs or line graphs. Another bar diagram shows the distribution of revenue % between three segments. A heatmap table on the left displays the distribution of profit by ship mode, and on the top, it shows profit distribution by segment. The table uses darker background colors to represent higher values, creating a heatmap effect. Another distribution table shows profits per region and the three segments, also with a heatmap color scheme. A final table on the right displays category, product name, sales, sales by month (as sparklines), profit, profit by month, quantity, average price, and average check
Created by the author — Dimensions maximize the analytical value of data to users while reducing cognitive load.

Data visualizations maximize the use of the user’s attention

Third, data visualization principles must serve two critical functions from the perspective of information theory. First, the visual hierarchy should control which signals (or visuals) will be noticed first and which last. 

The pre-attentive attributes of color and size best create a visual hierarchy. Research studies have shown that the brain can process visual information, including color, in as little as 13 milliseconds. Visualizations that are larger and with greater color contrast will get the first milliseconds of attention. The visual system will scan for the next largest and contrasting object as it takes in information. More important signals are placed from the top left to bottom right (when users read from left to right).

Why should there be a difference in the order in which we need the information processed? As we know, the capacity of short-term memory is limited. Hence, the signal sender should not show too many visuals with equal importance simultaneously.

The second function of data visualization principles is to minimize noise. From the perspective of information theory, noise is unwanted variation or disturbance that can corrupt or interfere with the accurate transmission or reception of information. In our communication system, noise has many ways to introduce itself:

  1. Visual noise: Visual clutter consists of elements that add no informational value or distract from the intended message, such as excessive use of colors, icons, or graphical elements that do not contribute to conveying the relevant message.
  2. Data noise: Data noise consists of inaccuracies, inconsistencies, or irrelevant data points that can confuse or mislead the user. The sender can introduce this noise even with error-free data if they lack the domain knowledge to design visuals as signals.
  3. Interface noise: Interface noise refers to design or usability issues that hinder the user’s ability to interact with the canvas effectively. This could include confusing layouts, unclear labels, or unintuitive navigation, making it difficult for the user to access and interpret the messages.

One of the most effective ways to decrease clutter is to apply the Gestalt principles, four of which are especially useful here: Proximity, Similarity, Continuity, and Closure. Their use makes it easier to delete unnecessary lines, group similar items without spending pixels to highlight the grouping, and reduce complex shapes to their essential forms. Less unnecessary clutter means less noise without compromising the intended message. By using Gestalt principles to minimize noise, we leverage the mind’s existing universal encoding mechanisms to eliminate unnecessary pixels, increasing the data-to-pixel ratio.
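If you build charts in code, the same declutter principle can be applied directly. Here is a small matplotlib sketch (chart data invented) that strips non-data pixels from a bar chart:

```python
# Strip non-data ink from a bar chart: spines, tick marks, gridlines.
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["West", "East", "South"], [180, 270, 95], color="#4a7ebb")

# Remove pixels that carry no data: top/right/left spines and tick marks.
for side in ("top", "right", "left"):
    ax.spines[side].set_visible(False)
ax.tick_params(length=0)
ax.set_title("Sales by region", loc="left")
fig.savefig("sales.png")
```

Every removed spine and tick is noise eliminated without touching the message, the coding equivalent of the Gestalt-driven cleanup described above.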

UX/UI design principles to customize the design to the use cases

Power BI dashboard with two maps. Top: a map of regions, with the West Region highlighted by a click and an interactive region selector. Bottom: a map of cities in the USA and UK, where red dots mark cities with low profit margins, blue dots mark cities with acceptable margins, and dot size corresponds to profit in absolute terms. A pop-up menu on the left offers three analysis options: Territories Analysis (selected), Product Analysis, and Customer Analysis (links to other dashboards).
Example of a dashboard made by the author for a speaking event, based on the Global Superstore dataset, available on Kaggle.

One dashboard cannot send all of the messages, and trying to put all the information on a single canvas can overwhelm the user. A collection of interrelated dashboards and reports can solve this problem and provide insights in manageable doses. In essence, by creating a system of dashboards, we increase the channel's capacity to transmit signals to each user. Each canvas communicates fewer signals, but the user can request more once their mind has processed the first batch, through interactive elements such as page transitions, buttons, pop-ups, and drill-downs.

We can also tailor each canvas to the use case of the receiver. The channel's capacity depends on the use case. The user or receiver could be a busy CEO with five minutes to get the most important signals; such use cases require specific visuals highlighting the current state versus the target. Or the user could be a middle manager tasked with looking deeper into the causes of an indicator's recent underperformance; in this use case, we can provide a table with many filters. Another sometimes-overlooked channel is the mobile use case. A simple picture of key metric visuals, sent regularly to a group chat of executives and managers, can do wonders for delivering signals as fast as possible.

Conclusion

The data overload highlighted in the Oracle report can be addressed more effectively if we treat dashboards as communication channels. Here’s how:

  • Establish a shared knowledge base between the sender and the user.
  • Enable interactive dimensional filtering with star data models.
  • Transform KPIs into relative forms by comparing them with expected values.
  • Implement an organized system of dashboards with user-friendly navigation.
  • Utilize pre-attentive attributes to establish a visual hierarchy.
  • Reduce visual clutter using Gestalt principles.

Footnotes

[1] Csikszentmihalyi, Mihaly (1990). Flow: The Psychology of Optimal Experience.


This article was edited by Catherine Ramsdell.

The post Making Dashboards Optimal for Human Brain Processing appeared first on Nightingale.

Step 6 in the Data Exploration Journey: Cut to Realistic Scope https://nightingaledvs.com/data-exploration-cut-to-realistic-scope/ Mon, 21 Aug 2023 12:43:12 +0000 https://dvsnightingstg.wpenginepowered.com/?p=18311 A project to find insights from DVS's State of the Industry Survey data moves from the analysis stages to preparation for build out.

This article is part 7 in a series on data exploration, and the common struggles that we all face when trying to learn something new. A list of previous entries can be found at the end of the article. I began this series while serving as the Director of Education for the Data Visualization Society, because so many people were asking to hear more about the process of data exploration and analysis. What began as an exploratory project on the “State of the Industry Survey” data grew into a 1.5-year Career Portraits project that produced the 2023 “Career Paths in Data Visualization” report (DVS member login required). This series illustrates how I approach a new project, and the fact that no “expert” is immune from the challenges and setbacks of learning. Let’s see where this journey takes us!

In the previous installment in the series, Jenn Schilling and I had formed a new collaboration to rework my exploratory analysis of the DVS State of the Industry Survey data. After our initial introductory conversations, we needed to get down into the narrowest part of the focus phase: redefining our scope and deliverables in the context of the new project. Once you’ve opened up all of the possibilities and peeked down the many avenues for exploration, there comes a time where you need to pause and decide what you are actually going to build. At this stage in the project lifecycle, your focus should be on getting to the absolute essentials of what needs to happen so that you have a clear path to getting things done.

A lot of people struggle with this part of the process because it requires you to be very disciplined about giving things up and letting go. That can feel harsh or even scary at times. It’s natural to get attached to your ideas and your project, and it can be very hard to leave behind ideas that you’ve invested in. This is one of the best reasons that you should try to keep the initial exploration light and fast. You need to remain flexible enough to pivot before entering the build stage, and it’s a lot easier to make that call if you remember that you’re “just sketching” in the early stages, and avoid becoming too deeply invested in a particular outcome. 

I like to compare the focus stage to pruning plants in the garden. It feels cruel at the time, but a good pruning inspires vigorous new growth, and it’s often necessary to keep a plant healthy and happy. If you prefer the traveling analogy that I used in my previous articles, this is the stage where you lighten your pack before beginning that last, hardest climb. Either way, it’s a good opportunity to trim away the things that aren’t working and to make deliberate choices about how to move forward into the build stage. 

“I like to compare the focus stage to pruning plants in the garden. It feels cruel at the time, but a good pruning inspires vigorous new growth, and it’s often necessary to keep a plant healthy and happy.”

Of course, everyone is different. Some people are very uncomfortable when ideating and prefer to have a clear plan at all times. For them, getting to the focus stage can feel like a real relief. If you find yourself saying that you “don’t have any good ideas,” it may be that you are too good at the focus stage and need to spend some time ideating and playing with your ideas before jumping straight into focus mode. If you’re someone who tends to overcommit and feel overwhelmed by the size of a project, then taking a moment to put things down can be a lifesaver, because it gives you the space to keep your project from spiraling out of control. It’s worth remembering that there are lots of ways to react to the same situation. Try experimenting with unfamiliar approaches to get around your blocks. 

Personally, I don’t like cutting things back, but I do like to focus, and I realize that pruning is crucial to success. It’s really exciting to feel the scope click into place, and it can be such a relief to cut through the noise and end up with a clear plan. Focusing on that desired outcome helps me to push through the cuts that I find harder to make. 

It often helps to distance myself from the project a little bit before beginning the focus stage. At this point, I need to let things take on a life of their own, rather than pushing for that one thing I thought I would make. Every project has an identity and a best expression in the world that is informed by its circumstances, and those may be very different from where you thought you’d end up. If you’ve learned things during the journey, it is natural for your plans to adapt and change accordingly. You can always come back to your initial ideas in another project. Your role at this stage is to give this project its best chance to reach its full potential. Assess where things are, cut anything that no longer helps, and focus on how you can best get to done.

Here’s a checklist for getting through this crucial stage:  

  • Take an honest look at what’s possible—and what’s not. Be realistic about what you can do with the time and the resources that you have. 
  • Re-focus your scope. You started out with one vision in your head, but you’ve learned some things since then. Trim anything that no longer fits.
  • Discard what you don’t need. If you can’t carry it to the finish line, don’t put it in your pack. Be ruthless about what’s really needed, and what can be left behind. 
  • Be prepared to start over. More often than not, I find that the best way to focus your project is to scrap your initial sketches and start over. You have a different perspective now than you did when starting out, and that can help you to put the pieces together in a completely different way. This is especially true if you’re doing any kind of data analysis; that early work harbors hidden mistakes that will trip you up later, so it’s best to start from the beginning and work it through again. 
  • Explore new possibilities. When you’re clear on where you’re going, you’ll find new things that you need to do to get you there. This opens up new opportunities and new things to learn. (For those who are not comfortable with the focus stage, identifying new horizons to focus on can be a huge help.)
  • Prepare to commit. This next stage is all about doing things thoroughly and right. Sketching and ideating can be lots of fun, but the build phase requires you to get busy and roll up your sleeves. The first diamond is focused on exploration and speed, but the second depends on quality, craftsmanship, and discipline to make your project the best that it can be.  

What this looked like in the survey project 

As a reminder, this series began with an exploratory analysis of the DVS State of the Industry Survey to understand the tools that people use in different data visualization careers. As I talked to others at the DVS, my initial project was slowly morphing into something different. I was eager to dig in and follow up on a bunch of interesting leads from the tools data, but the survey contains many questions across all areas of data visualization practice, and we thought it might be more useful to map out a general picture of career paths as a first step, and then come back to the details of tool sets used in different careers later. The new project had a much larger scope that wouldn’t have been possible with my limited R skills and manual Excel manipulation. It was really the addition of Jenn’s skillset to the project that allowed us to make that choice. 

“We needed to set some realistic goals … switching from the ideation phase to the build phase of the project.”

One of the first things Jenn and I needed to do in our new collaboration was to set some realistic goals for our joint project outcomes, switching from the ideation phase to the build phase of the project. We knew that the new project would require a very different analysis of the dataset, so we began by doing a mini-exploration to identify the variables that we had available in the survey, the key questions that we thought we could answer, and to look at some initial values in the dataset. We mapped out the different options, focusing on the variables and analyses that we felt would best support the careers overview we wanted.

Collection of digital post-its in a Mural board, grouped into categories to reflect different sections of the report.
Topic board for planning the structure of the Career Portraits report, showing the individual survey questions and how they inform questions that early career professionals might have when entering the field.

With a clearer idea of the goals and variable space in hand, Jenn and I walked through my odd mish-mash of R code and Excel files to be sure that we understood what we were trying to do, and then we scrapped it all and started fresh. Keeping the old code would only have slowed us down, and it would almost certainly have introduced mistakes. It was a huge learning opportunity for me to see how someone who knew what they were doing in R would approach the same problem. I’d struggled for weeks trying to wrangle the data into the form that I needed (or even to understand exactly what it was that I needed). It turns out that partial pivots and sequential group-by operations are hard to invent from scratch, even though they are very obvious once you know what you’re doing. From the perspective of my learning, the decision to throw out all of my initial work and start over was hands down the most useful part of the entire project. I got to see how an expert approached the problem, and it helped me figure out which fundamentals I’d been missing when trying to learn the software on my own. 
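For readers wondering what those operations look like in practice, here is a rough sketch in pandas — the authors worked in R, and the column names and data below are invented purely for illustration. A "sequential group-by" counts responses per career-and-tool pair, and a "partial pivot" then lifts one grouping column into the header row while leaving the other as the row key:

```python
import pandas as pd

# Invented stand-in for survey responses: one row per respondent-tool pair.
responses = pd.DataFrame({
    "career": ["Analyst", "Analyst", "Engineer", "Engineer", "Analyst"],
    "tool":   ["Excel", "R", "R", "Python", "Excel"],
})

# Sequential group-by: count responses for each (career, tool) pair.
counts = responses.groupby(["career", "tool"]).size()

# Partial pivot: move 'tool' from the row index into the columns,
# keeping 'career' as the remaining row key.
wide = counts.unstack("tool", fill_value=0)
```

These two steps are obvious once you have seen them, but — as described above — they are hard to invent from scratch.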

“From the perspective of my learning, the decision to throw out all of my initial work and start over was hands down the most useful part of the entire project.”

As we began re-working the code, we also started to get into deeper detail with the dataset. One of the key things I’d glossed over in the initial exploration was an analysis of statistical relevance in the survey results. If there is one mistake that I see most often from inexperienced folks, it is getting excited about a “story” in the data before checking to be sure that the trend you see is reasonably likely to be real. I wanted a sense of the size and types of variation in the data before worrying too much about the interpretation, but we needed to get our feet back on solid ground before we could go any further. Getting ahead of yourself here risks having your whole project fall apart when the analysis doesn’t hold up. 

The basic test of statistical relevance is this: “Is the difference I see in my data big enough to be meaningful beyond the measurement noise?” If you would see the same difference when randomly sampling different subsets of the dataset or when measuring a different population, then you’ve got a winner. If the measurement vanishes when you sample differently, then it’s likely just an artifact of your method. Without a real statistical analysis, I had no way of knowing whether a 10% variation between two variables was likely to mean something, or whether it was just a blip in the dataset.
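That "sample differently and see if the difference survives" logic can be sketched as a simple permutation test. This is not the method the DVS team used — just an illustration in plain Python with synthetic numbers — of how random relabeling reveals whether an observed difference is larger than chance alone would produce:

```python
import random

def permutation_p_value(group_a, group_b, n_shuffles=2000, seed=42):
    """Share of random relabelings whose group difference is at least
    as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        if diff >= observed:
            hits += 1
    return hits / n_shuffles

# Two clearly separated sets of synthetic response rates: their difference
# almost never reappears under random relabeling, so the p-value is tiny.
p = permutation_p_value([0.2, 0.25, 0.22, 0.24, 0.21] * 4,
                        [0.45, 0.5, 0.48, 0.52, 0.47] * 4)
```

If the shuffled groups reproduce the observed difference often, the "story" is likely just an artifact of the sampling.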

In our particular example, this check was complicated by the fact that the survey has multiple branches, so that different people see slightly different versions of the survey with more focused questions depending on their career area. For example, if a respondent answers that they are an employee, they will see questions about their organization size and other items that don’t make sense if they say they are an independent contractor. For us, that meant that there were multiple respondent populations that needed to be evaluated separately. 

Before choosing a statistical method, we looked at the simple response counts across the different career areas and branches in the dataset for each survey question that we wanted to use in our analysis. This gave us a sense of how many respondents we had for each sub-group in the dataset. We flagged anything in red that had counts too low to be valuable, and this assessment reinforced our decision to focus only on the employee branch and a subset of careers. That meant that our Career Portraits would be based mainly on people working as employees and less on those who identified as freelancers, and it would eliminate some careers from our first publication.

This was a difficult decision to make (we wanted everybody to be represented!), but we didn’t think that it made sense to try to compare across the populations and question structures given how varied the response counts were for the different populations. Measuring against fewer than 20 responses for one career group and 700 for another would create large differences in the quality of information reported, and we felt that some of the smaller populations would be better served with separate analyses later on.

Excel table with 13+ questions broken out over 7+ careers, with individual count values for each one.
Excel table of response-counts analysis for the different questions and branches involved in the Career Portraits work. Each cell represents the number of respondents for a particular cut of the data. Blue questions are asked of everyone, purple are only asked of specific branches, and cells highlighted in red are ones where we felt that the counts were too low to support robust analysis or insightful comparisons.
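A screening step like the one in the table above can be sketched in a few lines of pandas. The counts and the cutoff of 20 respondents below are invented for illustration, not the DVS team's actual numbers or rule:

```python
import pandas as pd

# Invented respondent counts per career group for two survey questions.
counts = pd.DataFrame(
    {"Org size": [700, 150, 12], "Salary": [680, 140, 9]},
    index=["Analyst", "Engineer", "Journalist"],
)

MIN_RESPONSES = 20                   # illustrative cutoff only
too_low = counts < MIN_RESPONSES     # True where a cell would be flagged red
keep = counts[~too_low.any(axis=1)]  # careers with enough data everywhere
```

Rows with any flagged cell drop out of the comparison, mirroring the decision to set aside the smallest populations for separate analyses.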

Our survey is a broad-spectrum instrument, designed to give a holistic overview of practices in the field. It isn’t a formal research project, and it’s not intended or structured to answer deep, research-level questions. We also weren’t trying to publish a final statement on universal practices in data vis. For that context, it didn’t really make sense to try to pin down a specific statistical error bar for each comparison that we wanted to make, and we didn’t think that doing so would really help people to interpret the final results. Instead, we focused on a general margin-of-error calculation across all questions. This indicated that a 10% difference was the rough threshold for significant variation, and that gave us a rule-of-thumb guideline for interpreting the comparisons that we wanted to make. 
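That 10 percent rule of thumb matches the standard worst-case survey margin of error, MOE ≈ z·√(p(1−p)/n) evaluated at p = 0.5. A quick sketch (the sub-group sizes here are illustrative, not the survey's actual counts):

```python
import math

def margin_of_error(n, z=1.96):
    """Worst-case (p = 0.5) margin of error at roughly 95% confidence."""
    return z * math.sqrt(0.25 / n)

# With around 100 respondents in a sub-group, differences under ~10% are
# indistinguishable from sampling noise.
moe_100 = margin_of_error(100)  # about 0.098, i.e. roughly 10%
```

The margin shrinks with the square root of n, so quadrupling the respondent count only halves it — which is why the small sub-groups could not support fine-grained comparisons.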

Next, we created a list of all the new analyses that we would need to support our reduced focus. I did some manual analyses in Excel to get through the initial explorations for those items and started creating an outline for our final report. Jenn took a stab at moving the tools analysis one step further ahead. With her data science background, she was able to add a clustering analysis to the dataset, resolving one of the items that I knew I needed but didn’t have enough technical knowledge to complete. About a month in, we swapped projects: I dug deeper into her clustering work, and she switched focus to work on the main project, whipping through a huge set of analyses in a few weeks that probably would have taken me a year to figure out. 

“There were all kinds of interesting things to tease out of the tools analysis, but we ended up leaving out most of it when we realized how much time it would take to fill in the rest of the report.”

Based on the new analysis and the data coverage considerations, we made much more specific decisions about what to include. We looked at the number of people who answered a question, the amount of variation that we saw in the results, and the relevance of those results to our new project focus. There were all kinds of interesting things to tease out of the tools analysis, but we ended up leaving out most of it when we realized how much time it would take to fill in the rest of the report. 

In the end, we discarded almost everything in my initial tools exploration in favor of the new focus we’d chosen. I don’t consider that to be a problem or a loss, and I don’t think that the initial project was wasted effort. We did what we needed to do to understand the dataset, and then we made tough choices based on where we wanted to go. Rather than being discouraged or disappointed by the outcome, I was excited to start fresh on a new project, and grateful for the early explorations that led us there.

In my experience, this is the most common outcome of an early exploration and refocusing stage: If your exploration is effective, then you almost always end up reframing the question that you set out to ask. (That’s why you do the exploration…it’s the whole point!) Collaborating with Jenn opened up possibilities that I didn’t have when working alone, and connecting with people in the #topics-in-dataviz channel on the DVS Slack and other forums helped us to understand what would be most useful to folks in our community. In the end, we decided on a broader overview rather than a deep dive on tools because we thought it would have the most relevance for the people we wanted to serve. I hope that the tools analysis work will come back as part of another project someday, but even if it doesn’t, my meandering journey gave us what we needed to frame the Career Portraits work, and I count that as a win. 


Previous articles in this series:

Embrace the Challenge to Beat Imposter Syndrome
Step 1 in the Data Exploration Journey: Getting to Know Your Data
Step 2 in the Data Exploration Journey: Going Deeper into the Analysis
Step 3 in the Data Exploration Journey: Productive Tangents
Step 4 in the Data Exploration Journey: Knowing When to Stop
Step 5 in the Data Exploration Journey: Collaborate to Accelerate 

The post Step 6 in the Data Exploration Journey: Cut to Realistic Scope appeared first on Nightingale.

An Interview with Jordan Morrow, Author of “Be Data Analytical” https://nightingaledvs.com/jordan-morrow-interview/ Wed, 16 Aug 2023 13:12:28 +0000 https://dvsnightingstg.wpenginepowered.com/?p=18305 In his latest book, Morrow explores how data can influence decision making. Here's a peek at what's inside—plus, a Q&A with the author.

Data is a necessary but insufficient ingredient to make strategic decisions. On its own, data is simply recorded observations, often reflected in numbers on a spreadsheet. In order to bridge the gap between data and decision-making, it is necessary to leverage analytics to derive value and insight from the data. That bridge is the focus of Jordan Morrow’s book, Be Data Analytical: How to Use Analytics to Turn Data Into Value, which focuses on how data analytics combined with data visualization can help us make better decisions in both our personal and professional lives.

This book focuses on how data storytelling can influence decision making. As the figure below from the book illustrates, data is the foundational first step in the process but by itself it cannot drive or influence the decision. The middle circle reflects the key bridge whereby data is turned into valuable insight through the analytical process. This insight is then what ultimately helps drive a decision. This book does not provide technical instructions on each of these steps but focuses on the framework and process geared towards professionals who work with data or are interested in working more with data. 

Three words in bubbles. The smallest bubble is "data." The middle-sized bubble is "insight." The third and largest bubble is "decision." In between data and insight is the word "analytics," and in between insight and decision is "framework." Under all three bubbles are the words "Data Storytelling."
Data-driven decision-making train.

The book is structured along the four stages of data analytics: descriptive, diagnostic, predictive, and prescriptive (see Figure 4.1 from the book below). Using an example from the medical profession, Morrow likens descriptive analytics to a doctor telling you what your symptoms are. Diagnostic analytics takes it a step further and focuses on why your symptoms are occurring. Predictive analytics reflects the medical profession's research, which tells the doctor which treatments will lead to different outcomes. The final stage, prescriptive analytics, is when the doctor prescribes medication to treat the symptoms. Just as with a visit to the doctor, this process is rarely a linear pathway and is often an iterative one in which earlier stages are revisited as part of the analysis.

Rather than providing technical instruction or code recommendations, Morrow focuses his book on providing a high-level framework for how to understand the key questions and value from each of these stages and how it relates to different types of occupations from data analysts to data engineers. Through weaving in both personal and professional examples, the book strikes an effective balance of providing a clear foundation for anyone new to using data while also highlighting critical insights that will be valuable for more seasoned experts.

The four levels of analytics in word bubbles: Descriptive, Diagnostic, Predictive, and Prescriptive. The bubbles are connected by bi-directional arrows in a circular fashion. In the middle of the four words is the word "Iterative."
The four levels of analytics.

While the book is primarily focused on data analytics, Morrow also weaves in discussion on data visualization throughout. One key quote from the book related to data viz is: “Let’s remember that data visualizations are an important part of data and analytics, but they are not the end goal. The data visualizations should be there to help end users get the insight they need to do their jobs better” (pg. 60). Within this framework, the book provides helpful recommendations on how data visualization can enhance the analytics process while maintaining a clear focus on the bridge between data and decision-making so that data viz is a value-add rather than a superfluous distraction.

This book provides a clear focus on the ultimate purpose of data and how it can be useful in driving decisions. I often fall into the trap of assuming that making a data analysis or visualization more technically complex will naturally lead to it being more valuable. Morrow does a great job of deconstructing this mindset and focusing on how different parts of the data analytics process from initial descriptive analytics to more complex prescriptive analytics all have a critical function to play in driving decision making. If you are interested in having a strategic framework to guide how to use data better in your professional and personal life then I would highly recommend giving this book a read.

To learn more about this book, I had the chance to interview Jordan Morrow to ask several questions. See a synopsis of that conversation below:


Joshua Pine (JP): Could you give us a brief introduction to the book from your perspective? What do you see as some of the key insights?

Jordan Morrow (JM): I don’t think I would have ever thought I would write three books, and I am now writing a fourth. For this book I wanted to continue the trajectory of my first book, which focused on data literacy, and move into the world of data and analytics. Most people don’t need another book about formulas or statistics. I wanted to weave in more than just business examples and share personal anecdotes that people can relate to more. I want people to see themselves in the world of data analytics and to provide a conceptual framework.

JP: What do you see as the overlaps between the worlds of data analytics and data visualization? How can data viz specifically work within the four stages of data analytics (descriptive, diagnostic, predictive, and prescriptive)? 

JM: When you have a dataset to analyze with 50 columns and 100,000 rows, you don’t want to have to manually look through that to find insights. Data visualization is a powerful tool that can spark curiosity, questions, and discussions around diagnostic analytics. Visualizations can bring these analytics to life and can also be part of later steps in the analytics process including predictive analytics models.

How data visualization plays out within the four stages is highly dependent on context and needs. While Excel often gets a bad reputation, it is often sufficient for a lot of visualizations. Sometimes we’re just focused on what is happening, and our data visualization can stay within the descriptive analytics space. Other times, as we grow in data literacy, we may need a self-service dashboard that targets the diagnostic or predictive analytics stages. As we know, visual perception is often our most powerful sense, and when it is harnessed as part of prescriptive analytics or generative AI tools, it can help illustrate some really complex topics.

JP: How does data storytelling and data literacy intersect? How can you craft data visualizations that both meet audience members where they are at in their literacy journey while also pushing them to grow and mature?

JM: First of all, we need to get to know our audience members really really well. After that, we need to explore how to integrate education into data visualization, whether that’s in the form of tooltips with additional information or links to guide the user through the process and explain any new or foreign concepts. Another important piece can be to find an accountability partner to bounce ideas off of and to gut check whether the visualization you are creating accomplishes what you’re trying to get at. 

JP: In your book, you emphasize the “human factor” that can foster creativity and contextualize our analytics work. How do we balance the positive aspects of our human contributions while avoiding the dangers of bias?

JM: With the rise of generative AI tools, it is more important than ever to lean into the human element of our analytics and visualization work. As more mundane and routine tasks get automated, we should embrace our creative human side to shape the direction of those tools. In order to do this well, however—while minimizing the dangers of human bias—is where data literacy comes into play. Continuing to grow your data literacy will enable you to better understand what type of insights you’re able to derive from the data and where potential biases may emerge. Another strategy that can help is going back to an accountability partner or mentor who is able to provide candid feedback based on mutual trust and respect. Finding someone in your life who can fill that role can be really powerful in your data journey.

“With the rise of generative AI tools, it is more important than ever to lean into the human element of our analytics and visualization work. As more mundane and routine tasks get automated, we should embrace our creative human side to shape the direction of those tools.”

JP: How should we view the shifts that will happen in the data analytics and data visualization fields due to generative AI? How can practitioners prepare themselves for this new reality?

JM: The reality is that generative AI tools will be disruptive and will inevitably lead to job displacement or, at the very least, it will replace certain tasks within job portfolios. We should embrace this new reality and recognize that it will free us up to engage in deep work and focus on our creative human potential. We should view generative AI as our partner and leverage its technical capacity so we can flourish as data analysts, data scientists, data engineers, and other roles. We should collaborate with, rather than compete against, these tools.

With regards to the four stages of data analytics, generative AI seems best suited to support the descriptive, predictive, and prescriptive processes. Based on its current capabilities, it does not seem able to fully fulfill the diagnostic phase, with its focus on deciphering insights and answering key questions about the why behind observations. From a career perspective, the diagnostic phase may be the most valuable one for us to lean into right now as we continue to partner with generative AI tools.


Disclaimer: Some of the links in this post are Amazon Affiliate links. This means that if you click on the link and make a purchase, we may receive a small commission at no extra cost to you. Thank you for your support!

The post An Interview with Jordan Morrow, Author of “Be Data Analytical” appeared first on Nightingale.
