Five journalists spoke to CPJ about the challenges they face reporting on COVID-19 data. Clockwise from top left: Darren Long, South China Morning Post creative director (Antony Dickson); Rodrigo Menegat, data journalist at Brazil's Estadão (Bianca Menegat); Mariano Zafra, storytelling and graphics editor at Spain's El País (Uly Martín); Allison McCann, New York Times reporter and graphics editor (Cassandra Giraldo); Iranian journalist Denise Hassanzade Ajiri (Radio Free Europe/Roozbeh Bolhari)

Data journalists describe challenges of reporting on the true toll of COVID-19

How many people worldwide have been infected by the coronavirus, and how many have died as a result? Finding reliable information on the virus’s toll has proven such a challenging task that it is nearly impossible to answer these basic questions, five data journalists from around the world told CPJ in May and June.

In some countries, journalists say that governments have deliberately shielded figures from the public, while in others, insufficient data forces journalists to sift through multiple reports to find trends. In response, journalists have turned to creativity, collaboration, and hidden sources to illuminate the coronavirus’s impact, in many cases exposing governments for distorting some official reports along the way.

Since the beginning of the COVID-19 outbreak in Wuhan, China, Darren Long, creative director at the South China Morning Post — an English-language newspaper based in Hong Kong — treated data from the Chinese government with skepticism, he told CPJ. He said the data from the mainland has been useful for illustrating general trends in the way the virus has moved through the population. But he called the official figures “misleading” overall because of holes in the data — some his team knew about, and others they didn’t.

For instance, his team knew that the government did not include asymptomatic cases in the total case count until April 1, but it didn’t know whether the government counted victims with underlying conditions as COVID-19 deaths (some other countries did not). This problem of not knowing what they didn’t know was exacerbated because Chinese censorship of the internet made it difficult to verify figures, Long explained. As CPJ has documented, the Chinese government has forced news outlets to take down articles and reports that contradict the state’s narrative of combatting the virus successfully.

In order to produce its in-depth, data-rich work, Long’s team used social media platforms to learn what people on the ground in Wuhan were saying and also cross-checked municipality data against central government data. In addition, the team tried to compare English translations to the original Chinese-language reports to make sure they were accurate because, according to Long, the Chinese government often finds ways to make its English reports “reflect better” on the country.

A Chinese government spokesperson did not reply to CPJ’s email request for comment.

When working with coronavirus data in Brazil, journalists say they must navigate apparent government attempts at deliberately misinforming the public. Rodrigo Menegat, of São Paulo-based news website Estadão, said President Jair Bolsonaro’s government has attempted to strategically “downplay” the severity of the pandemic through the data it releases. 

In early June, the Brazilian health ministry stopped releasing historical data on the total number of COVID-19 deaths and cases in the country, and instead only released data on new deaths and cases from the previous 24 hours. The country began releasing the cumulative data again after an outcry from citizens and an order to do so by Brazil’s Supreme Court.

But even with the Supreme Court ruling, Menegat has had trouble accessing information due to the Brazilian government’s restrictions on public records. In March, the government passed a measure that temporarily suspended deadlines for public authorities to respond to information requests. The measure was then suspended three days later by the Supreme Court, as CPJ has documented. Still, Menegat and his team have been unable to access Brazilian data about testing, the demographics of the people who are falling victim to the disease, or how the country’s total mortality rate has changed during the pandemic.

“There is a lot of data that we should have in order to better assess how severe the situation is in Brazil, but we simply don’t have it,” Menegat explained.

In a statement sent to CPJ by Brazil’s Ministry of Communication and Ministry of Health, the Brazilian government said that it has made “all the information, including data and numbers” on COVID-19 in the country available to the public, and that the government has been working to widen access to that information by developing new platforms. The statement did not respond to questions about misinformation in the COVID-19 data or public records restrictions.

Denise Hassanzade Ajiri, an Iranian journalist living in the United States who is reporting on the coronavirus in Iran, has also found data about the coronavirus coming out of the country to be neither reliable nor readily available. “Usually finding data within Iran is a bit complicated, but not impossible,” she said. “But when it comes to stories related to the coronavirus, everything is a mess. The government doesn’t really provide data.” 

When Ajiri does access virus data from sources inside the country, she doesn’t trust it, saying the government “deliberately spreads misinformation.” CPJ has documented that journalists in Iran have been ordered to only announce official virus data, even after news reports of mass graves emerged, suggesting that the death toll was in fact much higher than the official count.

An Iranian government spokesperson did not respond to CPJ’s email requesting comment.

Allison McCann, a reporter and graphics editor for The New York Times’ International Desk and a contributor to the Times’ COVID-19 open database — a public repository of coronavirus data that the Times uses in its reporting — said she and her team have found that certain countries appear to be “obfuscating” the COVID-19 data they release by choosing not to test for the virus or releasing incomplete data, pointing to Turkey and Russia as two examples.

But it is not only in countries typically known for a lack of access to information where journalists must chase sources and cross-check official numbers, McCann said. In the United States, collecting and verifying data about the virus is challenging because at the local level there are sometimes multiple government bodies reporting different data sets; for instance, municipalities and counties typically have their own reporting systems. “Most of the world has one central health care system or one centralized health care body, so it’s actually much easier to track cases and deaths,” she said.

In Spain, the government sometimes releases data on the number of people who have tested positive on viral tests, while at other times it also includes the number who have tested positive on antibody tests, said Mariano Zafra, storytelling and graphics editor at Spanish daily El País. Other reports contain different figures, such as the number of asymptomatic cases. The constant changes “hinder good analysis and projections,” he said. To complicate things further, national and local data sets often do not add up, with national authorities reporting far fewer deaths from the virus than the total number reported by local authorities, Zafra said.

The spokesperson for the Spanish Ministry of Health, Consumer Affairs, and Social Welfare did not reply to CPJ’s email request for comment.

Amid the challenges, however, journalists have creatively managed to continue to inform readers. At El País, journalists with different specialties are sharing expertise. Zafra recently published a story with science journalist Javier Salas on how the virus spreads in indoor spaces by illustrating a scientific article. And at the Times, McCann and her team have found that documenting the fact that there are more deaths now than in typical times — a measure known as “excess mortality” — can help illustrate the virus’s impact.
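For readers who work with the numbers themselves, excess mortality can be sketched as observed deaths minus the deaths that would be expected in a typical period. The snippet below is a minimal illustration only, not the Times’ method: the function name, the simple-average baseline, and every figure in it are hypothetical.

```python
from statistics import mean


def excess_mortality(observed_deaths, baseline_deaths_by_year):
    """Return (excess deaths, percent above baseline) for one period.

    Assumption for illustration: the expected number of deaths is the plain
    average of the same period in earlier years. Real analyses often use
    more careful baselines (seasonal or trend-adjusted models).
    """
    if not baseline_deaths_by_year:
        raise ValueError("need at least one baseline year")
    expected = mean(baseline_deaths_by_year)
    excess = observed_deaths - expected
    percent_above = 100 * excess / expected
    return excess, percent_above


# Hypothetical figures: deaths in the same week of three earlier years,
# compared with a hypothetical count for that week during the pandemic.
baseline = [1000, 1040, 960]
excess, percent_above = excess_mortality(1500, baseline)
print(f"{excess:.0f} excess deaths ({percent_above:.0f}% above baseline)")
```

Because the calculation relies on total deaths from all causes, it does not depend on how a government classifies or tests for COVID-19 deaths, which is why it can help where official virus counts are incomplete.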

Journalists also rely on data experts to round out their reporting. Early on in the pandemic, Johns Hopkins University in the United States developed a COVID-19 database and tracker as an international tool to track the spread of the pandemic in near real-time. The database, which consolidates global coronavirus data, is widely cited by reporters, including both McCann and Long. But Long said that even though the tracker is “superb,” it still contains information that can be “very misleading,” given that the countries covered by the database vary in how they confirm cases, in their testing capacity, and in their transparency.

A Johns Hopkins spokesperson told CPJ in a statement that the database team continually vets the data in order to present the most accurate information reported by public health agencies. However, their data tracking effort “can only be as good as the information coming out of public health agencies,” the spokesperson explained. In places where the Times has reporters on the ground, McCann said her team works to “verify and confirm and improve” the Johns Hopkins data.

Journalists around the world have also built tools that they hope will make COVID-19 data more reliable, accessible, and accurate for all. In Greece, the journalistic nonprofit organization iMEdD, which supports a wide community of independent media professionals, created an open source database with COVID-19 data broken down by Greek regions. And Code for Africa, a civic technology and data journalism network across Africa, is working to build a similar database.

Turning messy, incomplete, and unreliable data into credible journalism is often grueling work, according to the journalists interviewed by CPJ. In some places, journalists are still fighting with their governments to access any data at all. Frustrating as this is, journalists said they are most concerned about how the lack of reliable data may put human lives in danger. Without understanding the virus’s true impact, they explained, the public is unable to make informed decisions during the pandemic — decisions that have the potential to make the difference between sickness and health.

“I can’t think of anything in my life as a journalist when we were trying to collect data like this for the world at the same time,” said McCann of the Times. “And now, the stakes are so much higher.”

[EDITOR’S NOTE: The ninth paragraph has been corrected to reflect how long the Brazilian government’s measure that suspended deadlines for public authorities to respond to information requests was in effect.]