Pandemic Information Sharing: EMBL-EBI’s COVID-19 Portal, GA4GH Take On Data Sharing – The portal gathers a range of data types, including genomic, protein, and microscopy data and scientific literature.
Earlier this week, EMBL unveiled its COVID-19 Data Portal, which brings together a broad variety of datasets, including genomic, protein, and microscopy data and scientific literature. The data portal is available through EMBL's European Bioinformatics Institute (EMBL-EBI).
The European COVID-19 Data Platform is one of the 10 projects listed in the first ERAvsCORONA action plan launched by the European Commission, and is a partnership between EMBL-EBI, ELIXIR, and the Commission.
“In some sense, the lead partner is the European Commission,” Ewan Birney, Director of EMBL-EBI, told Bio-IT World. “They’re both good funders and have been incredibly good colleagues in designing this. We’re sort of the technical engine.”
EMBL-EBI has a 30-year tradition of supporting data sharing among scientists, said Birney. “I think people outside molecular biology do not understand that sharing data when reporting results is the norm in molecular biology. It has led to some very robust systems for data sharing, even predating the internet, particularly for the structures of proteins and DNA sequences,” says Birney. “Building on these two traditions, protein structure and DNA sequence, we have developed data sharing for other forms of data such as proteomics and expression data.”
The COVID-19 Data Portal is the entry point for the larger European COVID-19 Data Platform, which also includes a number of SARS-CoV-2 Data Hubs. Once set up, the hubs will coordinate the flow of outbreak data and provide the European and global research communities with a full open data sharing network, accessible from the data portal. Both the European COVID-19 Data Portal and the SARS-CoV-2 Data Hubs build on established EMBL-EBI data infrastructure.
In addition to genomic data from the outbreak, the COVID-19 Data Portal contains datasets from many EMBL-EBI repositories, including the European Nucleotide Archive (ENA), UniProt, Protein Data Bank in Europe (PDBe), the Electron Microscopy Data Bank (EMDB), Expression Atlas, and Europe PMC. Birney acknowledged that this is an alphabet soup of partners.
The COVID-19 Data Portal provides an accessible, intuitive interface to the data, along with EBI software for analyzing and visualizing it. Additional datasets and tools will be added to the COVID-19 Data Platform in the coming weeks, with support from ELIXIR and other partners.
“I can’t tell you how proud I am of everyone at EMBL-EBI working so hard on the [Portal],” Birney said. “We have had to do a lot; it looks pretty easy, but behind the scenes there’s a great deal of re-drawing to gather this data from a biological problem perspective, and they’ve really worked their socks off in a really challenging situation where they had children at school, and all that.”
Database Glut
The new data portal joins a variety of other databases and resources worldwide aimed at understanding, detecting, treating, and eventually vaccinating against SARS-CoV-2. Birney is not worried about confusion or wasted effort arising from the volume of datasets.
“Many, many different websites, I don’t worry too much about. I assume there are several different views of the data and numerous topics to explore,” he said. “But the same data flow across the world is what we really should campaign for: using the same basic open data system and utilizing the data technology that we have built over the last 30 years.”
Birney also serves as Chair of GA4GH, the Global Alliance for Genomics and Health, which develops standards and harmonized approaches for effective and responsible genomic and health-related data sharing. GA4GH has been working since 2013 to develop and disseminate data sharing and use standards, and Birney advocates for applying those standards to COVID-19 research.
“We need the same standards,” Birney said. “The way Genomics England stores the data has to be the same way as Finland, the same way as the US, otherwise every time you go somewhere, you have to rethink everything. That would be a complete nightmare!”
Data In The Time Of Corona
Yet we are in a unique situation in history, with new external constraints on data sharing and the use of data.
Of the spate of new databases and resources, Heidi Rehm, one of two GA4GH vice-chairs and medical director of the Clinical Research Sequencing Platform at the Broad Institute, said, “I think that what you see is that people say, ‘Look, we must get the information out there, regardless of the format, and use it.’”
We need a balance between flexibility and real sharing at the moment, said Rehm. “When [researchers] have to go in and handle some items by hand because they do not suit the format, there are plenty of people who are able to take the time to do that now,” she theorized. Researchers, businesses and other data holders are being pragmatic. “Let’s just put it out, even though it’s not in perfect shape,” she observed.
This industry-wide data dump is not without danger. Compiling and analyzing data from multiple sources raises the possibility of misinterpretation and false associations, in certain cases with severe consequences.
“We do have to be careful that we take the same scientific rigor when we’re making claims about things,” she warned, “[While,] at the same time, coming up with hypotheses that warrant further investigation. Any way we can get those hypotheses and early evidences out there, the better!”
Ensuring scientific rigor is not the job of standards, Rehm points out. GA4GH standards focus on data formats and file types, how structured APIs can be built for data sharing, data use and researcher identification mechanisms to ensure efficient access to data, the CRAM and BAM file formats, software container systems, and so on. “These things are a lot more practical in many respects,” Rehm said.
But they are all there to facilitate the sharing of data.
“We’re looking to encourage very rapid data sharing in all domains that relate to this outbreak,” Rehm said. “Everything from results being seen from viral testing, so we know who’s contracting the virus and where, to proper tracing and assessing risk across the population. Some of that is not scientific discovery, it’s literally a public health emergency.”
Original post: Data Sharing During A Pandemic