12 Oct 2021
The GA4GH October Connect meeting occurred from 12 to 14 October 2021 to provide opportunities for collaboration across the GA4GH Work Streams, Driver Projects, and the broader community and to support contributors in advancing work on the GA4GH Roadmap. Read more below.
Opening Remarks: Building Momentum on Implementation
GA4GH Chief Standards Officer Susan Fairley shared an overview of GA4GH and next steps to accelerate our work and the implementation of GA4GH standards, emphasizing the importance of applying genomics in healthcare, enabling global interoperability, and putting GA4GH standards into practice across a broad range of settings. Several GA4GH initiatives are supporting these efforts, including the Federated Analysis Systems Project (FASP), the Technical Alignment Sub Committee (TASC), and the GA4GH Starter Kit. The expansion of the GA4GH Technical Team will also help to accelerate these efforts, providing additional capacity and support to the contributor community. The session then continued with updates from all GA4GH Work Streams and Initiatives.
Work Stream Updates
All the GA4GH Work Streams, the Federated Analysis Systems Project (FASP), and the Equity, Diversity, and Inclusion (EDI) Advisory Group gave updates on their initiatives and discussed the sessions they will host at GA4GH Connect.
Opportunities for Collaboration: Presentations from External Initiatives
The Human Pangenome Reference Consortium (HPRC) aims to improve representation of sequence diversity in the human population by sequencing 350 diverse human genomes and creating a comprehensive map of genome variation. While currently using GA4GH-maintained standards such as VCF, BAM, and CRAM, HPRC hopes to explore variation representation with the GA4GH Genomic Knowledge Standards (GKS) Work Stream and seek global partnerships to increase sample diversity, share data, and coordinate on standards.
PHA4GE aims to improve mechanisms for protected pathogen data sharing and standardize approaches to compute in this space. Through several working groups, PHA4GE has adapted GA4GH’s approach for the pathogen genomics community, with a focus on bridging research with public health and government.
CINECA aims to deploy GA4GH standards to enable human cohort interoperability through a paradigm of federated analysis. The group aims to tackle a series of challenges, including federated data discovery, interoperable authentication and authorization, harmonized cohort-level metadata, federated analysis interoperability for research and healthcare applications, and a transnational harmonised ELSI framework.
Cloud Work Stream
The Cloud Work Stream identified major roadmap items and specification updates across all its APIs. The Workflow Execution Service (WES) API team aims to examine the feasibility of generic WES input and output formats and support for GA4GH Passports; the Task Execution Service (TES) API team will continue non-breaking improvements to current API capabilities, along with supporting the GA4GH Passports standard; the Tool Registry Service (TRS) API team aims to explore TRS URIs; and the Data Repository Service (DRS) API team will continue iterating on proposals for batch requests for DRS objects and pagination of results.
Pedigree Standard Implementation
The GA4GH Pedigree team demonstrated use of the draft Pedigree standard in data collection, management, and analysis tools. As next steps, the team aims to share the minimum core dataset for feedback and approval; demonstrate use within the GA4GH Phenopackets standard; submit the Pedigree standard for GA4GH approval; and extend converter tooling to support additional formats.
Federated Analysis Systems Project (FASP)
The FASP team aims to collect use cases for 2022 and beyond, with a goal of collaborating with groups such as NCPI and asking permission to use and expand use cases. In particular, the team hopes to expand geography and participation of their vertical and horizontal demos shown previously, to continue promoting the theme of same compute in different places. To support these efforts, FASP hopes to collaborate more closely with the Technical Alignment Sub Committee (TASC) and the other GA4GH Work Streams to develop and harmonize testbeds and API testing across GA4GH.
The newly formed Cohort Representation subgroup presented a landscape review and path forward. The group has decided to split into two “pizza teams” to focus on key topics: 1) minimal information for a computable cohort, which will aim to define the set of attributes required to describe a cohort consistently; and 2) phenotypes, which will look at existing phenotypic standards and at combining phenotypic information into cohort representation.
The Genetic Discrimination Information Document, a GA4GH deliverable in development, was reviewed at the meeting. The group discussed strengthening the document by extending the consent clauses section to encompass all tools and resources relevant to anti-discrimination on the basis of genetics. The team also discussed the ongoing Delphi Study, in which analysis of round 1 results is under way, with round 2 beginning in November. The team aims to begin brainstorming what to do with the study results, as well as overall next steps for the Genetic Discrimination Observatory initiative within GA4GH.
The Passports team discussed successes, challenges, and next steps for the standard. The standard has been used successfully in single-broker systems with some level of trust, and the next update, Passports v1.2, will aim to resolve some ambiguity in the specification, with the overall goal of allowing loosely coupled institutes in federated systems to share data. The next challenge is determining what is required to allow a system of multiple brokers across boundaries, which will require discussion and alignment of policy, governance, and technology. The team is collecting use cases to gather requirements for further technical design on this question.
LSG infrastructure & specification development processes
Members of the Large Scale Genomics Work Stream explored the technical and social aspects of specification development. The team began with an overview of technical procedures that are currently used to maintain both PDF and HTML specifications, and discussed implementing Architecture Design Records to help document and crystallize important decisions made in specification development. The team aims to continue these process discussions, with the goal of sharing these techniques that other Work Streams and groups can put into action.
Future of VCF
The VCF group from the Large Scale Genomics Work Stream reviewed and determined gaps within their landscape analysis, and narrowed down use cases that were high priority. The team aims to set things into motion to further establish the group and will focus on advancing work on simulated data and the requirements that it must fulfill.
Regulatory & Ethics Work Stream
The Regulatory & Ethics Work Stream (REWS) met to move forward current roadmap deliverables. The Data Access Committee Review Standards (DACReS) Policy was unanimously approved by the Work Stream and will be brought to the next Steering Committee for GA4GH approval, and Consent Clauses for Large Scale Sequencing was unanimously approved to move forward to public comment. REWS also began brainstorming new areas to explore for their 2020-2021 roadmap, including diversity of datasets, benefit sharing, remote participation in research, approaches to consent for recontact, and communication around public/private collaborations.
Data Access Committee Review Standards (DACReS)
With the support of NIH funding, the DACReS team aims to focus on two areas of work: 1) capacity building for procedural standardization, including testing and validating the policy through consultation workshops and pilot exercises; and 2) exploring the emergence of automated governance solutions in the data access management space. As next steps, the team aims to use surveys and interviews to explore quality indicators of Data Access Committees and gauge interest in, or concerns about, adopting automated approaches. The group is open to other ideas and suggestions for further exploration.
The Genomic Knowledge Standards (GKS) Work Stream presented updates on their latest standards: Variation Representation Specification and Variant Annotation. For next steps, the Work Stream aims to focus on demonstrable applications and tools and provide solutions to the application of congruent variation.
FASP (Data Connect + Cloud)
The Data Connect and Cloud teams ran through a series of demos involving searching for COVID-19 and cancer datasets. The teams aim to continue aggregating use cases, particularly how Data Connect could be used for workflow-based use cases such as monitoring the status of workflow runs. The teams also aim to advance integration of Passports in Data Connect and explore federated use cases involving sharing data models across multiple sites.
Several Driver Projects, institutions, and organizations were invited to present their current processes for genomic testing in “ordinary” medical care, with the goal of gathering new insights in order to draft a model template for genomic research consent clauses for data sharing. A number of key challenges and barriers were identified, along with key considerations for implementing a dynamic consent system. Next, the team aims to collect consent clauses from the community and discuss whether they typify what is currently happening in the world of clinical genomics consent.
The EDI Advisory Group announced the launch of the Onboarding initiative, after a successful pilot program that paired four newcomers with four Work Stream Guides. Next, the team shared results from an EDI survey this year, which revealed that more than 50% of respondents were not as involved in GA4GH as they wanted to be. These individuals felt that a lack of time and support impeded their ability to engage with the community. With this in mind, the group held an interactive brainstorming session to determine the EDI Advisory Group’s next project. The group decided that their efforts would be best geared towards developing a Work Stream Best Practices document.
Data Model Library Feedback
The goal of the data model library is to design a platform that will encourage interoperability across specifications from different GA4GH Work Streams, encourage schema reuse for new standards, and provide stable endpoints for GA4GH schemas. The group discussed cross Work Stream feedback on requirements for a GA4GH-wide schema library, with the aim of reaching broad agreement on key requirements and features. Next, the group aims to reach a common understanding of the scope of the project, including what the data platform will provide for internal contributors as well as downstream consumers of data models, continue discussion of unresolved items, and discuss ideas for a new GA4GH API specification, the Data Model Registry Service.
Refget & Sequence Collections Updates
The refget and sequence collections groups presented brief summaries of five major topic areas with the goal of increasing participation: the adoption of an array-based format for defining sequence collections; sequence order within collections; compatibility functions and comparing collections; reverse lookup; and integration into other GA4GH products. The teams aim to carry on these conversations at regular meetings and to develop and distribute better use cases for reverse lookup.
The Security Workshop included a presentation on privacy-preserving federated analytics for personalized medicine, a solution that uses multi-party homomorphic encryption. MedCo uses such a system to provide a distributed software platform for federated cohort exploration and analytics of clinical and genomic data. Discussions are ongoing with the ELIXIR Cloud & AAI Driver Project on integrating MedCo with the GA4GH Cloud APIs.
Clin/Pheno Driver Project Roundtable
The Clinical and Phenotypic Data Capture (Clin/Pheno) Work Stream held a roundtable to build a collaborative community across the GA4GH network, ensure that future Clin/Pheno deliverables can meet community needs, and receive input and feedback on the next Clin/Pheno roadmap. The Work Stream heard from many Driver Projects and groups, including: EUCANCan, EJP-RD, SPHN, SNOMED, HPO & Clinical Terminologies, CanDIG & EpiShare, Australian Genomics, ISO/TC215/SC1, and H3Africa.
This session provided the current landscape of AI/ML bias, beginning with examples of AI/ML bias, biases that can occur throughout the AI lifecycle, types of biases in the data, model evaluations, and the importance of building trust through transparency. Afterwards, the group conducted a discussion on opportunities for AI/ML bias standards within GA4GH. A new cross Work Stream initiative will be created to continue discussions and work in this space, and the team will aim to conduct a survey of AI/ML use across GA4GH.
The Sequence Annotation group reviewed the current draft of the Sequence Annotation model. The team discussed the current draft structure and relationships in the model, as well as implementing a controlled vocabulary for transcript types. The team is open to feedback!
Chief Standards Officer Susan Fairley recapped all twenty sessions from Connect and shared next steps and upcoming announcements from GA4GH.