Illustration by Kyle Webster

Data visualization has arguably been the star of coronavirus pandemic coverage. From early graphics urging us to flatten the curve, to John Burn-Murdoch’s Financial Times charts, to regularly updated dashboards like the Johns Hopkins COVID-19 dashboard, we’ve been inundated with visual interpretations of the pandemic data.

Screenshot of the Johns Hopkins global COVID-19 data visualization dashboard.
The Johns Hopkins COVID-19 Dashboard aggregates data from sources such as the World Health Organization, the U.S. Centers for Disease Control and Prevention, the European Centre for Disease Prevention and Control, and several others. Image credit Johns Hopkins.

“The pandemic is one of the first times in my living memory where so much data is publicly available during a crisis,” points out Shirley Wu, a data visualization designer working in the Bay Area. People are so close to the numbers and are engaging with them on a daily basis. “There’s really no way to communicate the pandemic other than through data,” adds Jane Zhang, a data viz designer based in Toronto. Suddenly, design and data are a central part of people’s lives.

Charts and graphs can feel like a comforting way of making sense of complex and overwhelming data sets. When dealing with quantitative data this large (with case numbers in the millions at the time of writing), data visualization helps people feel like they can understand the world around them. For Wu, “data visualization is one of the fastest and most efficient ways to wrap your mind around large amounts of data.”

However, the role of data and of data visualization during the coronavirus public health crisis is much more complex and nuanced than it might seem at first glance.

Why data viz has played a prominent role in pandemic coverage 

An interactive Covid-19 data visualization by the Financial Times.
John Burn-Murdoch’s charts are the most-read pages in the history of the Financial Times. An interactive version allows comparison between countries, and provides detailed information on how cases and deaths are counted, the difference between logarithmic and linear scales, and data sources. Image credit Financial Times.

Fundamentally, “data visualization is what it sounds like – it’s basically visualizing data,” says Zhang. “Usually the data is quantitative, and the goal is to show it graphically. One of the main ways that data viz creates insight is by creating comparability between data.” At its core, data visualization is a way to communicate and make sense of information. As people going through a very challenging global health crisis, we are craving narratives and information to help us understand what is happening. This is why we see the popularity of graphs that compare COVID-19 data from different countries. 

Animated flatten the curve comic strip by Dr. Siouxsie Wiles and Toby Morris.
Dr. Siouxsie Wiles and Toby Morris’s comic strip version was one of many widely shared graphics that aimed to explain the concept of ‘flattening the curve’ through a simplified chart. Image credit Wikimedia Commons.

Beyond communicating, data visualization also plays a role in helping to convince people to change their behavior. Once a virus is spreading, public health officials need to make critical decisions about how much to communicate and at what point. “One of the most important aspects of containing an outbreak is all about convincing people to change their behavior when it’s not immediately clear they should do something yet. Data visualization has been so important in communicating and convincing,” according to Wu. Some of the early graphics that became widely shared and influential during the initial stages of the pandemic in North America and Europe urged people to do their part by succinctly explaining the concept of ‘flattening the curve’.

Wu felt compelled to take some kind of action in convincing people to stay home, and built a game called “People of the Pandemic,” in which players can experience the impact of their choices on the number of available hospital beds in a community. The game is based on data models, and uses data visualization and storytelling skills to bring the data to life. “We really wanted to help people to connect their everyday actions to the possible consequences. The game tries to do this by hyper-personalizing the player’s situation to their local community, and tying the player’s actions to outcomes in the game.”

In the People of the Pandemic game, a player inputs their zip code or chooses a community size in order to make the gameplay more personalized to their real-life situation.

Creating the game took six weeks, and highlighted some of the many challenges in effectively visualizing data during the pandemic. Wu and her team grappled with many questions, such as the methodology behind the game logic and simulation, working with experts to vet the choices in the game, and ultimately understanding the complexity and nuance of the data sets that inspired the game.

The challenges of visualizing the uncertain

Behind every chart, graph, or model that you see during the pandemic is a complex set of choices and tradeoffs that the designer has made to get to a final product. Amanda Makulec is a data viz lead at Excella, and has a public health background, which informs her understanding of the tradeoffs that are being made: “So much about the case data and even the deaths data has so much uncertainty around it. We have an expectation these days that data is instant, but the reality is that case reports for COVID-19 data move through a somewhat manual process.” For example, there’s a lag between when a test is taken, when it is processed in a lab, and when the result gets recorded in the official counts in a local or national repository.

While many graphs emphasize comparability, Makulec points out that definitions are inconsistent across states and countries, for example in what counts as a death attributed to COVID-19. Cases are also a function of the rate of testing, which varies widely between countries. In addition, we have to consider data quality: “At every level where you are aggregating data, small data quality issues at the lowest level get compounded as you aggregate up and up. This is why the CDC will say trust the data at your local administrative level first.”
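The point about testing rates can be made concrete with a little arithmetic. The sketch below is only an illustration: the two regions, their population, the infection rate, and the testing rates are all hypothetical numbers, and it makes the simplifying assumption that tests are given to a random sample of the population, which is not how testing works in practice.

```python
def reported_cases(population, infected_per_1000, tests_per_1000):
    """Confirmed cases if a random sample of the population is tested.

    All inputs are hypothetical; real testing is targeted, not random.
    """
    tests = population * tests_per_1000 // 1000
    return tests * infected_per_1000 // 1000


# Two made-up regions with the same size and the same true infection rate.
# Only the amount of testing differs.
region_a = reported_cases(1_000_000, infected_per_1000=20, tests_per_1000=5)
region_b = reported_cases(1_000_000, infected_per_1000=20, tests_per_1000=50)

print(region_a, region_b)  # 100 1000
```

Under these assumptions, a chart of confirmed cases would show the second region with ten times as many cases as the first, even though the underlying outbreak is identical.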

Similarly, when talking about the process for the “People of the Pandemic” game, Wu reflected on the importance of “anyone who is thinking about visualizing pandemic data becoming intimately aware of the data collection challenges. Not every country collects data the same way, not every country has the same metric on who gets tested, different countries have different requirements on what counts as a COVID death.” 

For Wu and her team, these challenges informed a crucial decision in their game. “This is part of the reason why we decided not to use current data because there were so many grey areas. We had to go with a fictional virus. We were trying to balance and juggle the simulation and trying to get those underlying numbers as close to what we know in reality, but being careful not to say we are trying to simulate the virus itself.”

During gameplay in People of the Pandemic, data visualization shows the potential impacts of players’ choices, visualized in infected cases by age group, day, and available hospital beds. Because COVID-19 data is so new and changes so rapidly, the game visualizes a fictional virus rather than the novel coronavirus specifically.

Perhaps the biggest challenge for data visualization designers during the pandemic is how quickly the data changes and how much uncertainty we are dealing with. Part of the comfort and popularity of data visualizations during this time is the perception that they make us more certain, and make the data seem more certain. “Visualizing data makes it seem more certain, but with the coronavirus data there is so much unknown, unquantified uncertainty. What do we do when we have people demanding more visualizations of data that is very uncertain? We’re really bad at visualizing uncertainty,” reflects Makulec.

Becoming better data viz creators and consumers

We like certainty. We like feeling like we understand, like we have some kind of control. Wu reminds us that “a lot of people who don’t do data viz won’t know to look into the methodologies. They will see an official looking chart and take it as a hard fact.” Herein lies the risk for both data viz creators and consumers.

This issue of trust is key, from Zhang’s perspective. She mentions the issues around trusting data and how it can be manipulated or misrepresented. “There’s a great book by Alberto Cairo called How Charts Lie that talks about how people misrepresent data – sometimes intentionally or sometimes unintentionally due to lack of experience.” Wu agrees: “We’ve known for a long time how easily statistics can mislead, and we’ve seen how charts can mislead. There’s a risk if it’s not labelled or titled correctly, if it’s not annotated correctly, if the axes are misleading.”

When the stakes are as high as informing public opinion and action during a global pandemic, designers have a big responsibility for the graphics they create. All three designers highlighted this responsibility several times. All are members of the Data Visualization Society, which has provided guidance on this responsibility through a piece Makulec wrote, ‘Ten Considerations Before You Create Another Chart About COVID-19.’

The main message is ‘don’t viz irresponsibly.’ “Good intentions are not enough, you can still do harm. Follow a ‘do no harm’ principle and be honest about where we can and can’t help as data viz designers,” advises Makulec.

For Wu, partnering with experts was a non-negotiable part of the approach to creating the “People of the Pandemic” game. “When I started creating the game, I knew I couldn’t do it alone, and that I needed to partner with experts.” She worked together with spatial data specialist Stephen Osserman, and they both consulted with Dr. Sidney Bell, a computational biologist with a doctorate in virus evolution. The team also publicly shared their model and the assumptions underlying the gameplay in detail.

“Data visualization is interdisciplinary, and requires an understanding of the data as well as an understanding of design,” says Zhang. “You have to think about the end person who will be consuming this visualization.” In this way, data viz is like many design disciplines that require a focus on the end user, equipping them and educating them. Accessibility of data visualization has also been an important topic during the pandemic, with much work to be done on ensuring data visualizations and charts are accessible to all.

For those of us consuming data visualizations as part of pandemic coverage, it is crucial to be informed and think critically. “Look to reliable sources that accurately and adequately cite their sources, and that provide a clear definition of what’s being plotted. Look for people who are honest about limitations of the visualizations and data,” says Makulec. “Some additional cues to look out for include annotation layers that make sense of weird blips or anomalies. Look for sources that are honest about the limitations of the data, and the caveats regarding the data.” In order to go beyond taking charts at face value, the responsibility is also on the reader.

Remembering to make the data more human

Reuters data visualization illustrating Italy's Covid-19 deaths between March 21 and 22.
The Reuters Graphics team’s data-driven reporting attempts to bring the human side of the data to the forefront, reminding us that case and death numbers represent people. Image credit A deluge of death in Northern Italy.

“There’s so much data readily available, and that’s both a blessing and a curse,” says Wu. Now, more than ever, we need clear data visualization design that doesn’t mislead or misinform, and doesn’t add to the noise. For data viz designers, there is a huge responsibility to be thoughtful in the part they are playing in informing public opinion.

No matter how helpful we find charts and graphs, Zhang highlights that it’s also worth remembering that in the end, as humans, “we’re not driven by numbers, we are driven by emotions.” And as much as this public health crisis has been one reported in terms of statistics and numbers, Makulec wants us to remember that this is a very human crisis. “The very best visualizations are the ones that force you to think about the human side of things.”

Data visualization is one more way for us to try to make sense of the complex, nuanced, and uncertain crisis that we are all living through. It comes with its own pitfalls, benefits and complexities.