The continued decrease in the cost of genomic sequencing has yielded research cohorts of hundreds of thousands of genomes; millions more samples are anticipated in the coming years from both research and healthcare.
However, major barriers to data sharing hinder effective use of the data. These include a lack of data sharing mandates, difficulty in submitting data for sharing, inadequate resources for ingesting and storing data, insufficient consent for data sharing, a lack of dataset interoperability due to disparate data models and terminologies, inconsistency in data-generating pipelines, and an inability to address privacy issues and provide sufficient security.
To overcome these barriers, make the most of the data, and fulfill the human right to benefit from scientific advances as stated in the Universal Declaration of Human Rights, the research and healthcare communities must come together to agree on common methods for collecting, storing, transferring, accessing, and analyzing data. Otherwise, the data will remain siloed (e.g., by institution, country, or disease area), locking away their potential to contribute to research and medicine. The Global Alliance for Genomics and Health (GA4GH) was established to address this need by cultivating a common framework of standards and harmonized approaches for effective and responsible genomic and health-related data sharing.
GA4GH standards aim to be interoperable with one another and the broader standards ecosystem in order to enable a future in which clinical geneticists can quickly and efficiently search across all of the relevant genomic data to reveal unanticipated gene-disease associations and solve previously impenetrable cases; clinicians can make otherwise impossible treatment decisions by accessing clinical decision support that is based on the world’s best genomic knowledge; basic biologists and common disease researchers can interrogate cohorts large enough to achieve the power to detect all significant contributors to disease; and all qualified researchers—regardless of their means—can participate in genomics at a competitive pace. This ambition depends on data sharing across the globe as well as a federated system for searching, discovering, exchanging, and analyzing genomic and clinical data that is built on standards and interoperability frameworks embraced by the broad genomics and health community.
GA4GH has released 15 standards for APIs, data models, and more since rolling out the initial GA4GH Connect Strategic Roadmap in 2018. Collectively, these standards have been implemented or deployed by over 40 leading genomics institutions around the globe, including ELIXIR, the NIH All of Us Research Program, TOPMed, Genomics England, Australian Genomics, Illumina, Google, and Amazon Web Services. The GA4GH Federated Analysis Systems Project (FASP) is an early step in demonstrating how several of these standards can be implemented in concert to achieve the ambitious vision described above.
All told, genomic medicine stands at a crossroads: its promise to significantly enhance human health and medicine depends on harmonization across the community. The following Strategic Roadmap aims to enable that promise.