Data-sharing in a pandemic: even though scientists shared more than ever, it still wasn’t enough

Although countries shared more genetic information on SARS-CoV-2 than in any other pandemic, too many have hidden data, hindering the search for the next variant.

  • 5 April 2022
  • 6 min read
  • by Priya Joi
Photo by CDC on Unsplash

When the SARS-CoV-2 virus was first identified at the start of 2020, its entire genetic blueprint – its genome – was shared online within days. This rapid sharing was instrumental to vaccine development, helping us to produce safe, effective vaccines within just a year.

Technological advances in data sharing contributed to that speed, as did a global sense of urgency in fighting a virus that affected everyone on the planet. But immediate sharing has come with its own risks, say scientists, and two years into the pandemic, it’s clear that many countries have not shared as much information as they could have.

Technological breakthroughs

Advances in the speed and cost of gene sequencing – from room-sized machines to ones that are handheld – were critical in facilitating rapid sharing. Next-generation sequencing allows scientists to generate massive amounts of data. The process involves splitting DNA or RNA into millions of fragments that are sequenced in parallel and then reassembled to form the genomic sequence. This technology has only taken off in the past five years or so, according to Professor Anne-Mieke Vandamme from the University of Leuven, Belgium.

In 2015, when Professor Vandamme and her team led a project called Virogenesis to develop new tools to help analyse and interpret the data that comes from sequencing, only a handful of research labs were using the technology. It is now standard practice.

One tool they developed, Genome Detective, can take raw data, filter out any non-viral sequences, reconstruct the genome and identify the virus, including viruses never seen before.

Laboratories in low-income countries often lack the resources to undertake gene sequencing and have tended to rely on better-resourced countries to take the lead in analysing new viruses and their evolution around the globe. But new technology has brought not only higher-throughput sequencing but also cheaper, handheld sequencing devices.

Nanopore sequencing is a technology that can be packed into a portable device which, connected to a laptop, can read complex strings of RNA or DNA. Because specialist labs aren’t needed, researchers have been able to sequence viruses like SARS-CoV-2 in countries from Bangladesh to Zambia.

Because sequencing wasn’t limited to Europe and the USA, scientists have been able to put together a picture of the virus’s evolution and movement through the world. For example, the sequencing of Gambian genomes early in the pandemic showed they were similar to European and Asian genomes, which indicated that SARS-CoV-2 had been imported into Africa.

The ability of Africa and Asia to contribute significantly to sequencing efforts has led to major public-private partnerships such as the African Pathogen Genomics Initiative and the Indian SARS-CoV-2 Genomics Consortium, which should mean better data-sharing in future infectious disease outbreaks.

Quality over quantity

The rise of new technology to share raw data in real time has led to the sharing of research based on that data, ushering in a new era in which more non-peer-reviewed science is being shared than ever before. Of the nearly 20,000 articles published in the first four months of the pandemic, a third were preprints.

Traditionally, scientific journals have taken weeks or months on peer review – in which scientists carefully review research articles written by other researchers – to ensure that any data put out into the world is verified as much as possible. However, during the pandemic the responsibility to share accurate information has often been overtaken by the ethical responsibility to share information that could help response efforts “without waiting for publication in scientific journals”, according to the World Health Organization (WHO) Working Group on Ethics and COVID-19.

But as important as rapid data-sharing is, in the rush to disseminate information about COVID-19, preprints have caused many inaccuracies to be published, say researchers writing in the journal BMC Medical Ethics. They add that this has triggered “inappropriate changes in clinical care, ineffective public health responses, and increasing anxiety in communities”.

Dr Maria Van Kerkhove, WHO’s COVID-19 lead, told Nature that overall she believes preprints have been a positive force for speeding up research in the pandemic, but “for many, I think the jury is still out on how helpful [preprints] are because they can be quite damaging,” she says. “They can misdirect a policy or they can lead you astray if you don’t stay rooted in the totality of the science.”

Hidden data

Despite the appearance of more data being shared than ever before, it is also clear that many countries have been selective in what they share, and far more needs to be done to improve the timeliness of information sharing and to ensure that sequences are shared in full.

Researchers writing in Nature Genetics undertook a global landscape analysis on SARS-CoV-2 genomic surveillance and data on the Alpha and Delta variants of the virus. They found only 23 (37%) of 62 countries reported on variants of concern, and these countries shared less than half of their sequences on these variants in public repositories. One quarter of countries uploaded fewer than 25% of their sequences. Although some countries did not have the capacity to share adequate data, the researchers found that even well-resourced countries shared less than they could have.

With the virus still circulating rampantly around the world, and with the potential to evolve into a more pathogenic variant, these gaps in data limit our ability to anticipate it.

One reason for this could be that countries that raise the alarm about a new variant are often heavily penalised for it by trade and border closures – as South Africa was with Omicron – even if the variant didn’t originate in that country and even when the variant is in heavy circulation elsewhere. “Most countries that share those data usually are made to suffer for it,” Nnaemeka Ndodo, a molecular bioengineer at the Nigeria Centre for Disease Control in Abuja, told Nature.

Countering the threat of new variants of concern will take even more international cooperation and collaboration if we are to ensure the timely and complete sequencing of SARS-CoV-2 genomic data in all countries.