24 January 2019
A new transcontinental project led by EMBL’s European Bioinformatics Institute (EMBL-EBI) will pilot nearly all of GA4GH’s genomic data standards to enable a virtual cohort of more than 1.4 million individuals from Europe, Canada, and Africa.
In particular, the Common Infrastructure for National Cohorts in Europe, Canada and Africa (CINECA) project will utilise GA4GH Data Use and Researcher Identity standards to allow registered researchers to analyse population-scale genomic and biomolecular data through a federated cloud-based network in a way that meets all ethical and security requirements for the international sharing of health data.
“The goal of this implementation is to accelerate the process of accessing datasets in a safe and secure way,” said Thomas Keane, team lead of the European Genome-Phenome Archive (EGA) at EMBL-EBI and co-lead of the GA4GH Large Scale Genomics Work Stream. “All the control of the datasets remains with the local cohorts, as we’re not trying to create a centralised resource, but a federated one.”
Rapid access to clinical research data allows scientists to share their findings and reduce the need to duplicate costly studies. This accelerates research and helps advance benefits to patients through the responsible sharing of genetic, phenotypic, and lifestyle data on an unprecedented scale.
“By enabling access to genetic data from diverse human populations, CINECA will support the development of treatments tailored to each individual patient’s genetic profile, the ultimate goal of personalised medicine,” Keane said. “Clinicians need to be able to compare a patient’s genome to a large set of healthy people and sick people, in order to understand the underlying genetics of the patient. And by ‘large,’ we mean hundreds of thousands or even millions of other people.”
Several of the 18 CINECA partner organisations are active GA4GH contributors, including long-time GA4GH collaborators ELIXIR and H3Africa and GA4GH Driver Projects CanDIG and ENA/EVA/EGA. Spanning three continents, CINECA will use GA4GH standards to bring together data from 11 diverse cohorts in rare disease, common disease, and population health studies.
“The technical goals we have set for ourselves are ambitious,” said Mike Brudno, PI of the CanDIG project, Senior Scientist at The Hospital for Sick Children (SickKids) in Toronto, and a GA4GH Steering Committee member. “But CanDIG has extensive experience working with CINECA partner projects EGA and ELIXIR through their participation as peer Driver Projects within GA4GH. Building on what our projects have already done alone and together, we’re confident that we can not only meet those goals, but build open-source standards-based solutions for the entire community.”
Nicola Mulder, Head of Computational Biology at the University of Cape Town and Principal Investigator of H3ABioNet, a Pan-African bioinformatics network for H3Africa, said the project “provides an avenue for us to align with international best practices, and contribute to these from an African and resource-limited perspective. At the same time as contributing our own expertise in working with diverse African genetic data, we hope to gain experience in new technologies for data sharing and clinical implementations.”