How we work

Discovery vision statement

Read the 5-year vision statement of the work stream or read the full GA4GH Connect Strategic Plan.

Motivation and Mandate

We are in an era of abundant genomic information, fueled by steadily decreasing sequencing and processing costs and by service platforms that ease analysis. These critical resources are spread throughout the world and are increasingly challenging to aggregate for many reasons, including scale, regulatory differences, and the difficulty of harmonizing data from diverse origins. We believe a solution to this challenge is to facilitate the discovery and utilization of these varied data sources and services via standard APIs and context-aware user interfaces. The Discovery Work Stream aims to create a unified data discovery platform to make it easier to find and use data, tools, and infrastructure for genomics and clinical analysis.

Existing Standards

Organizations such as the Matchmaker Exchange, the Beacon project, BRCA Exchange, and many others address fragmented and diverse data sources by locally aggregating, harmonizing, and redistributing processed data through web-based user interfaces and standardized APIs. Unfortunately, each has its own data-sharing formats and sharing nuances. These differences make it difficult and inefficient for consumers to gain value by cross-referencing and combining these invaluable resources. Further, diverse datasets arising from different sequencing and processing technologies, as well as overlapping samples, add to interpretation challenges.

Proposed Solution

The Discovery Work Stream proposes a unified interface that acts as a facade to a varied, dynamic collection or registry of data sources and services, forming an interconnected ‘Internet of Genomics Data and Services.’ The network’s data sources and services can be crawled and indexed, exposing a single standardized API endpoint whose results a unified web interface can aggregate and present in a context-aware, meaningful manner. To achieve this, the Work Stream will design a suite of standards that:

  • are easy to implement with a community-maintained reference implementation.
  • reflect the context of the data that they share.
  • reflect the nuances in data-sharing preferences.
  • leave room to include information from meta-sites, such as DUOS, to help with usage.
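
The facade-over-a-registry idea described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names DiscoveryFacade, register, unregister, and search, the record shape, and the two stand-in sources are hypothetical and are not part of any GA4GH standard or of the APIs of the projects named earlier.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration only; not taken from any GA4GH specification.
Record = Dict[str, str]
SearchFn = Callable[[str], List[Record]]


@dataclass
class DiscoveryFacade:
    """Single entry point over a dynamic registry of data sources."""

    registry: Dict[str, SearchFn] = field(default_factory=dict)

    def register(self, name: str, search: SearchFn) -> None:
        # Sources can join the network at any time.
        self.registry[name] = search

    def unregister(self, name: str) -> None:
        # Sources can also leave; unknown names are ignored.
        self.registry.pop(name, None)

    def search(self, query: str) -> List[Record]:
        # Fan the query out to every registered source and tag each
        # record with the source it came from, in registration order.
        results: List[Record] = []
        for name, source_search in self.registry.items():
            for record in source_search(query):
                results.append({"source": name, **record})
        return results


# Two stand-in sources with different native record shapes (assumed, not real APIs).
def beacon_like(query: str) -> List[Record]:
    return [{"variant": query, "exists": "true"}]


def matchmaker_like(query: str) -> List[Record]:
    return [{"gene": query, "matches": "2"}]


facade = DiscoveryFacade()
facade.register("beacon-like", beacon_like)
facade.register("matchmaker-like", matchmaker_like)
hits = facade.search("BRCA2")
```

In this sketch the caller sees one search method regardless of how many sources are registered, and each result keeps a source label so that a user interface could present it in context. A real implementation would also need to handle harmonization of record formats and data-sharing preferences, which this sketch deliberately leaves out.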