How we work

Genomic Knowledge Standards vision statement

Read the 5-year vision statement of the work stream or read the full GA4GH Connect Strategic Plan.

Motivation and Mandate

Genomic data analysis and interpretation are at the heart of enabling genomic data to improve human health. Many analyses require locating interesting and potentially causative changes in genomic sequence before attempting to categorize, rank, and prioritize potential leads by intersecting patient data with known reference data sets. Each analysis method develops its own solutions to access reference genomic sequence, find and use baseline reference genomic annotation (e.g. genes, variations, regulatory regions, expression), integrate and find equivalence with other resources, model data, and distribute the results of the analysis to downstream consumers, whether human or computational. In addition, the provenance of annotation can be unclear and associated metadata may be unstructured. Results may not be directly comparable between two resources because of ambiguity in data representation, semantics, and provenance.

Existing Standards

VMC (Variation Modelling Collaboration) is a specification, now at version 0.1, for modelling simple variation; it was developed by members of the Variation Annotation Task Team (VATT). FHIR (Fast Healthcare Interoperability Resources) is a specification to enable the transfer of healthcare information over standard APIs. In addition, a number of GA4GH standards for modelling ontologies, genomic annotation, and RNA quantification have been developed as part of the schema/reference/compliance suite of applications.

Proposed Solution

The Genomic Knowledge Standards Work Stream (GKSWS) aims to develop, adopt, and adapt standards-based components to enable the exchange of reference genomic information through common APIs, thereby enabling the downstream analysis of genomic data. It will focus on developing specifications related to genomic sequence, annotation, and associated metadata/provenance.

GKSWS will engage with GA4GH Driver Projects, including analysis tool developers/consumers (VICC, GEL) and reference data providers (ClinGen, Ensembl), to ensure that standards-based solutions to data access and exchange are developed based on real-world use cases whilst also being applicable to more generalized scenarios. GKSWS will work closely with other GA4GH Work Streams (Large Scale Genomics, Discovery) in areas of common interest to move standards into production (VMC), and will partner with external standards development organizations to leverage existing specifications and to ensure GKSWS-developed standards are suitable for healthcare environments (HL7, FHIR).