Every two years, GA4GH conducts a gap analysis to identify areas of need that are not being adequately addressed by existing or planned GA4GH products, and to gather feedback on current GA4GH activities and methods. The GA4GH Gap Analysis aims to (i) identify gaps that may not be identified through the Work Stream Roadmap process, (ii) identify new opportunities for GA4GH, (iii) improve internal and external interoperability and collaboration, and (iv) improve uptake and usage.
The gap analysis process includes round table discussions with groups of GA4GH Driver Project Champions and Work Stream Leads as well as a community survey. The 2020 gap analysis was led by Heidi Rehm (MGH/Broad Institute) and Andrew Morris (HDR UK), and consisted of 14 round table discussions and 59 survey responses (77% from academia, 15% from industry, and 7% from healthcare). From this work, the GA4GH community identified three strategic imperatives for GA4GH to focus on over the next two years: (i) improve interoperability and alignment with external standards and between GA4GH standards, (ii) improve implementation support for technical standards, and (iii) engage more closely with healthcare and clinical standards.
To effectively drive uptake, GA4GH must demonstrate how our standards work together and allow seamless support of genomic activities. We must ensure our teams develop an interconnected suite of standards that are compatible and interoperable with each other and hardened for real-world use. We will identify alignment opportunities between GA4GH standards and support a centralized forum for discussing all ongoing GA4GH technical details.
Truly interoperable standards will enable solutions that can encompass multiple components of a pipeline and multiple platforms and use cases.
Federated Analysis Systems Project (FASP)
FASP was established by GA4GH to show that GA4GH APIs, when used in concert, can facilitate real-world, scientific use cases by conducting genomic analysis in the cloud. FASP aims to simulate how a researcher would search, access, and analyze genomic data within the GA4GH ecosystem via end-to-end test scenarios involving multiple Driver Projects. Implementations of multiple GA4GH API specifications are planned as part of FASP, including Search, Passports, DRS, TRS, and WES. Tests will be run via the testbed infrastructure against a wide variety of web service implementations, showing that common API specifications facilitate interoperability. This work will involve the development of a comprehensive list of scientific use cases, as well as new web services for the test scenarios to run against. The FASP test scenarios will illustrate how GA4GH standards solve a spectrum of challenges across the search-access-analyze workflow (e.g., datasets not discoverable, barriers to data access), driving larger-scale and more powerful analyses.
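As a rough illustration of how the access and analyze steps might chain together, the following Python sketch plans a workflow run without making any network calls. The URL paths follow our understanding of the published DRS v1, TRS v2, and WES v1 specifications and should be checked against the current versions. The base URLs, the helper function names, and the `inputs` key inside `workflow_params` are hypothetical and exist only for this example. It is a minimal sketch, not a description of the FASP test scenarios themselves.

```python
import json
from urllib.parse import quote

# Hypothetical service base URLs; real deployments publish their own.
DRS_BASE = "https://drs.example.org"
TRS_BASE = "https://trs.example.org"
WES_BASE = "https://wes.example.org"


def drs_object_url(base, object_id):
    """URL for fetching a data object's metadata (DRS v1 path layout)."""
    return f"{base}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"


def trs_descriptor_url(base, tool_id, version_id, descriptor_type="CWL"):
    """URL for a workflow descriptor (TRS v2 path layout).

    Tool IDs may contain slashes, so they are percent-encoded.
    """
    return (
        f"{base}/ga4gh/trs/v2/tools/{quote(tool_id, safe='')}"
        f"/versions/{quote(version_id, safe='')}/{descriptor_type}/descriptor"
    )


def wes_run_request(workflow_url, input_urls,
                    workflow_type="CWL", workflow_type_version="v1.0"):
    """Form fields for a WES run submission.

    The shape of workflow_params is workflow-specific; the "inputs" key
    used here is an assumption for illustration only.
    """
    return {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        "workflow_params": json.dumps({"inputs": input_urls}),
    }


def plan_analysis(object_ids, tool_id, version_id):
    """Combine the access (DRS), tool lookup (TRS), and analyze (WES) steps."""
    data_urls = [drs_object_url(DRS_BASE, oid) for oid in object_ids]
    workflow_url = trs_descriptor_url(TRS_BASE, tool_id, version_id)
    return {
        "submit_to": f"{WES_BASE}/ga4gh/wes/v1/runs",
        "form_fields": wes_run_request(workflow_url, data_urls),
    }


plan = plan_analysis(["abc123"], "org/tool", "1.0")
print(json.dumps(plan, indent=2))
```

In a real deployment, a client would POST the form fields to the `submit_to` URL with appropriate credentials (for example, tokens obtained through Passports). Authentication and the Search step are omitted here.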
Technical Alignment Sub-Committee (TASC)
TASC serves as a central decision-making group that documents and communicates its decisions across multiple stakeholders. The TASC primarily functions to create consistency and technical alignment across the GA4GH Work Streams. For instance, the group may make internal technical recommendations that impact multiple Work Streams (e.g., best practices for namespacing, GitHub usage, common data elements), both in response to requests from Work Streams and independently as proactive initiatives. The TASC also maintains and distributes technical resources to facilitate external alignment (e.g., list of technical contacts at other standards organizations). The TASC Team does not act as a product review gatekeeper, provide extensive engineering support to Work Streams, or define products.
SchemaBlocks is a “cross-workstreams, cross-drivers” initiative to document GA4GH object standards and prototypes, as well as common data formats and semantics. Launched in December 2018, this community initiative aligns documentation and implementation examples provided by GA4GH members. While future products and implementations may be completely based on SchemaBlocks components, this project does not attempt to develop a rigid, complete data schema but rather to provide the object vocabulary and semantics for a large range of developments.
Implementations of standards, particularly those that serve the high priority needs of the community, are critical to inform development of and harden the standard, ensuring that it can solve a real-world problem. Implementations also serve to instantiate a standard by bringing awareness to the standards as well as forcing adherence through the need to enable downstream and interconnected functions.
Driver projects and the expanding community are able to quickly adopt and implement GA4GH standards, driving broad uptake and subsequent interoperability across the community.
We will provide training and support to developers working to implement a GA4GH standard, as described below. User support for researchers and clinicians interacting with those implementations remains the job of the individual platform developers.
Federated Analysis Systems Project
While FASP was established to support interoperability between standards, it will also indirectly support implementation of those standards around the globe. As groups actively work to become interoperable, they will naturally reveal pain points in implementation. This forum ensures specification conformance and a consistent approach to implementation. Additionally, the exemplar implementations that come out of FASP can be copied by other groups who may not have the capacity to start from scratch. Finally, FASP, which includes both small players and powerhouse genomics organizations, increases the likelihood that others will see the immediate benefit of implementation and demonstrates the value of working together to accomplish real-world scenarios.
The GA4GH ecosystem of genomic standards will continue to expand as new standards and versions are released. To ensure we remain effective in disseminating our outputs clearly and comprehensively, the emerging GA4GH staff technical team will develop tools and web services to prepare and serve standards documentation. Automated documentation generation tools will consolidate API specification documentation in a single location and under a common style, enabling users to quickly access standards without having to hunt for them. Automated documentation will shorten development times, and ensure that documentation always accurately reflects the original standard. The Documentation Hub will also point to standards on GitHub and link out to different implementations that we track and endorse. This will allow researchers to browse standards and implementations and search for deployments. We will also publish how-to guides explaining how to accomplish certain use cases based on the scenarios outlined in FASP, for example, how to detect somatic mutations on cancer datasets using GA4GH services in the cloud. These guides will be hosted on our website.
Accessing GA4GH Resources Online
The GA4GH website (www.ga4gh.org) aims to present all GA4GH standards and frameworks in an accessible, easy-to-digest manner while also generating additional engagement between the Work Streams and potential technical contributors. All Work Stream meeting minutes and developer repositories are openly linked on the website, along with upcoming event information, conference reports, academic publications, information on becoming an organizational member and details about existing members, governance structure, and more. The website is hosted, developed, and maintained through in-kind support from the Wellcome Sanger Institute. Nearly 48,600 unique visitors explored the GA4GH website in 2019. We are also in the process of hiring a full-time web developer who will ensure that the website remains up to date, is accessible in multiple languages, encourages broad participation, and makes it clear how GA4GH standards can work alone or in concert to benefit platform users (e.g., researchers and clinicians) and ultimately improve patient outcomes.
In-person Training Seminars
Live training seminars will provide guidance to engineers setting up systems that process genomic data using GA4GH specifications. Seminars will be developed as a series of modules, each focusing on a subset of GA4GH standards. In this way, we will be able to adapt the program to meet the specific needs of diverse audiences. Creating the practical dimensions of the courses, updating and adding modules as exposure grows, and developing supporting documentation will require significant developer time. The first iteration of the training program will take place in 2020 and will focus on the theme of ‘Setting up a Federatable Genomic Data Centre.’ Using the suite of GA4GH standards, this first seminar will demonstrate how to store and process sequenced patient data using standard pipelines, and make the data available in a manner respecting patient consent.
GA4GH virtual webinars are free and open to the public and aim to promote uptake of approved GA4GH deliverables. Webinars consist of presentations from the relevant developers as well as user experiences from Driver Projects that have already developed an implementation. Presentations are followed by question-and-answer periods during which anyone from the community who wishes may participate. Recordings of the events are made available on our website alongside other documentation about the relevant standard to help support other platform developers as they seek to implement.
The GA4GH ELIXIR Maturity Model
We are currently developing a maturity model (MM) with the ELIXIR genomic data projects with a focus on the development and implementation of GA4GH standards to enhance health-related data access throughout Europe. Ultimately, this model will be expanded to support GA4GH’s global community to achieve the same goal. This ELIXIR::GA4GH Strategic Partnership Maturity Model consists of three parts.
Together these tools will provide the information needed to progress to the next level, ultimately aiding the ELIXIR network and the broader community in expanding its sensitive human data access framework through the inclusion of GA4GH standards.
GA4GH technical standards and policy frameworks aim to support a “learning health system” in which secondary use of patient data feeds into research, and the learnings from research reciprocally inform medical care. Historically, GA4GH has been well connected to the research side of this virtuous cycle; however, the diversity of the global clinical community has limited our ability to interface with the healthcare side. For GA4GH standards to be truly effective and for the organization to achieve its mission, we must overcome this limitation, which stems from a variety of origins, including (i) the diversity of stakeholders within the healthcare community, (ii) the limited resources in healthcare to support research engagement, (iii) the regulatory need for locked-down and standardized solutions, which means healthcare often buys rather than builds tools, and (iv) the difficulty of finding the right points of engagement (e.g., vendors/industry vs. clinicians).
Ability to efficiently and effectively respond to the needs of clinical stakeholders, developing standards that support the healthcare industry
Genomics in Health Implementation Forum
GA4GH engagement efforts in areas impacting precision health must be high level, allowing individual initiatives to implement standards in a manner appropriate for their local context. With proactive engagement at a regional and organizational level, GA4GH aims to ensure that its standards are easily accessible and meet the disparate needs of the global community. In order to strengthen international collaboration between national genomic initiatives, GA4GH has recently formed the Genomics in Health Implementation Forum (GHIF) to support the implementation of GA4GH interoperability standards and frameworks as well as to identify new use cases that require GA4GH’s attention. The GHIF builds on past activities of a subset of GA4GH Driver Projects—led by Australian Genomics and Genomics England—to convene thought leaders and domain experts from more than two dozen national and continent-wide genomics initiatives to promote knowledge exchange and collaboration as they pursue the common goal of advancing human health.
Healthcare and Clinical Advisory Group
While GHIF represents the core strategy for linking GA4GH to the world’s disparate healthcare communities, its scope does not include clinical groups such as diagnostic laboratories, specialist clinicians, electronic health record vendors, and more. GA4GH will launch a Clinical Advisory Group to help ensure that GA4GH is connecting in all the right ways and with all of the right groups to effectively and comprehensively engage the complex international clinical community. This group will have five key goals: (i) identify clinically relevant areas of focus missing from the current GA4GH roadmap, (ii) inform GA4GH standards to ensure they can be used in clinical settings, (iii) identify implementation opportunities within the clinical domain, (iv) create region- and sector-specific engagement strategies, and (v) align GA4GH’s development activities with those of other clinically focused SDOs (e.g., CDISC, HL7). The group will be a loose affiliation of leaders and key stakeholders representing the diversity of the healthcare sector; it will not be a formal entity and will not have official leadership. This group will meet regularly and provide feedback to the GA4GH executive committee on specific topics around its clinical engagement strategy.
Cross Standard Development Organizations Consortium (xSDO)
Like GA4GH, the International Organization for Standardization (ISO) and Health Level 7 (HL7) are international standards development organizations (SDOs) that actively develop standards related to genomics. Without intentional coordination to keep our respective products aligned, there is a risk of unnecessary proliferation of redundant standards, as well as the development of semantically and syntactically conflicting standards that will hamper large-scale interoperability and introduce confusion within the adopter community. To mitigate this risk, GA4GH has committed, through its participation in the nascent Cross SDO (xSDO) consortium, to coordinate its activities and future roadmaps with those of other SDOs, including ISO Technical Committee 215 (ISO/TC 215) for Health Informatics and HL7 Clinical Genomics (CG). This proactive coordination will help to ensure international alignment of genomics standards, particularly between Asia, Europe, and North America. In particular, the GA4GH Phenopackets standard has been approved as a work item in the programme for ISO/TC 215’s new Genomics Informatics Subcommittee. This work will increase the availability of standardized phenotypic information and expand the collection of use cases to develop a standard relevant to genomics communities internationally.
Phenopackets HL7 Fast Healthcare Interoperability Resources (FHIR) Implementation Guide
GA4GH has been awarded a contract from the NIH National Library of Medicine (NLM) to redevelop the GA4GH-approved standard Phenopackets as an HL7 Fast Healthcare Interoperability Resources (FHIR) Implementation Guide. FHIR is a standardized way of transmitting health data from one health information system to another through an application programming interface (API). The Implementation Guide will provide a set of rules for using FHIR resources to exchange phenotypic information. This work aims to (i) increase the availability of standardized phenotypic information, (ii) ensure that the use of FHIR for the exchange of clinical data meets the needs of genomics researchers and genomic medicine, and (iii) improve methods for clinical researchers to use electronic health records (EHRs) and other clinical data for medical research. This will enable the EHR community to extract, assimilate, and exchange genomic information from EHRs in a standard, efficient, and accurate fashion.
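To give a sense of what such a rule could look like in practice, the Python sketch below converts a minimal Phenopackets-style phenotypic feature into a FHIR Observation-shaped JSON document. This is an illustration under stated assumptions and not the mapping defined by the Implementation Guide. The choice of the Observation resource, the placement of the ontology term in `code`, the HPO code system URI, the helper function name, and the patient identifier are all assumptions made for this example.

```python
import json

# Assumed code system URI for the Human Phenotype Ontology; confirm the
# value required by the Implementation Guide before relying on it.
HPO_SYSTEM = "http://purl.obolibrary.org/obo/hp.owl"


def phenotypic_feature_to_observation(feature, patient_id):
    """Build a FHIR Observation-shaped dict from a phenotypic feature.

    `feature` is expected to carry an ontology term under "type" with
    "id" and "label" keys, as in a Phenopackets PhenotypicFeature.
    """
    term = feature["type"]
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {
                    "system": HPO_SYSTEM,
                    "code": term["id"],
                    "display": term["label"],
                }
            ]
        },
        # Hypothetical patient reference, for illustration only.
        "subject": {"reference": f"Patient/{patient_id}"},
    }


feature = {"type": {"id": "HP:0001250", "label": "Seizure"}}
observation = phenotypic_feature_to_observation(feature, "example-patient")
print(json.dumps(observation, indent=2))
```

A FHIR server would receive a document like this through its REST API. An implementation guide constrains which resources, fields, and terminologies are permitted, so that sender and receiver interpret the phenotype in the same way.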