Genomic Data Toolkit


Access and adopt ready-to-use Genomic Data Toolkit standards for genomic data sharing below, or download the full 5-year GA4GH Connect Strategic Plan.

htsget API v1

htsget is a genomic data retrieval specification that lets users download read data for only the subsections of the genome they are interested in. Without it, users must download the entire set of files containing that data, a slow, resource-intensive process.
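As a rough illustration of region-based retrieval, the sketch below builds an htsget reads request for a single genomic interval. The host `htsget.example.org` and the identifier `sample-001` are placeholders invented for this example; the query parameters (`format`, `referenceName`, `start`, `end`) follow the htsget specification as we understand it, with `start` 0-based inclusive and `end` exclusive. A server would typically answer with a JSON "ticket" listing the URLs to fetch; that step is not shown here.

```python
from urllib.parse import quote, urlencode


def htsget_reads_url(base_url, read_id, reference_name, start, end, fmt="BAM"):
    """Build an htsget URL requesting reads that overlap one genomic region.

    start is 0-based inclusive and end is exclusive.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    query = urlencode({
        "format": fmt,
        "referenceName": reference_name,
        "start": start,
        "end": end,
    })
    # The read identifier is percent-encoded so it is safe as a path segment.
    return f"{base_url.rstrip('/')}/reads/{quote(read_id, safe='')}?{query}"


# Hypothetical server and dataset identifier, for illustration only.
url = htsget_reads_url("https://htsget.example.org", "sample-001", "chr1", 10_000, 20_000)
print(url)
```

Only the 10 kb window on `chr1` is requested, rather than the whole alignment file.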

Workflow Execution Service (WES) API v1

Portable tools — the ability to execute a single analysis in a variety of environments — allow researchers to work with more data from more sources, and tool builders to support more researchers and more use cases. The Workflow Execution Service (WES) API provides a standard for exactly that. This API lets users run a single workflow (defined using CWL or WDL) on multiple different platforms, clouds, and environments, and be confident that it will work the same way. The API provides methods to request that a workflow be run, pass parameters to that workflow, get information about running workflows, and cancel a running workflow.
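The four operations described above (request a run, pass parameters, inspect runs, cancel a run) map onto a small set of HTTP routes. The sketch below is one way to model them, assuming the WES v1 base path `/ga4gh/wes/v1` and the run-request form fields `workflow_url`, `workflow_type`, `workflow_type_version`, and `workflow_params`. The helper names, the workflow URL, and the run identifier are hypothetical, and no network call is made.

```python
import json

WES_BASE = "/ga4gh/wes/v1"  # assumed base path for a WES v1 deployment

# Action name -> (HTTP method, path template). Action names are our own labels.
ROUTES = {
    "submit": ("POST", "/runs"),
    "list": ("GET", "/runs"),
    "describe": ("GET", "/runs/{run_id}"),
    "status": ("GET", "/runs/{run_id}/status"),
    "cancel": ("POST", "/runs/{run_id}/cancel"),
}


def wes_route(action, run_id=None):
    """Return the (method, path) pair for one WES operation."""
    method, template = ROUTES[action]
    if "{run_id}" in template and not run_id:
        raise ValueError(f"action {action!r} needs a run_id")
    return method, WES_BASE + template.format(run_id=run_id)


def run_request_fields(workflow_url, workflow_type, workflow_type_version, params):
    """Assemble the form fields for a run submission.

    workflow_params is sent as a JSON-encoded string.
    """
    return {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,  # "CWL" or "WDL"
        "workflow_type_version": workflow_type_version,
        "workflow_params": json.dumps(params, sort_keys=True),
    }


# Hypothetical workflow and parameters, for illustration only.
fields = run_request_fields(
    "https://example.org/workflows/align.cwl", "CWL", "v1.0", {"threads": 4}
)
print(wes_route("submit"), fields["workflow_params"])
print(wes_route("cancel", run_id="run-123"))
```

Because the same fields and routes apply to every compliant server, only the host name would change when moving the workflow between platforms.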

refget API v1

All sequencing-based genomic analysis uses a genomic “reference sequence”: a baseline of knowledge against which variations are observed. There are multiple human reference sequences of increasing accuracy, and different organizations refer to the same sequence by different names or reuse names for different reference releases. Reliable, reproducible genomic analysis depends on clear provenance back to reference data. The GA4GH refget API enables unambiguous access to reference genomic sequences across different databases and servers, using a checksum identifier derived from the sequence content itself.
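To make the idea of a content-derived identifier concrete, the sketch below computes two checksums from a sequence string: an MD5 digest, and a SHA-512 digest truncated to its first 24 bytes (often called TRUNC512). It assumes the sequence is normalised to uppercase with whitespace removed before hashing, and that sequences are served under a `/sequence/{id}` path with optional `start` and `end` parameters. The function names are our own, and the toy sequence is not real reference data.

```python
import hashlib


def normalise(sequence):
    """Strip all whitespace and uppercase the sequence before hashing."""
    return "".join(sequence.split()).upper()


def md5_id(sequence):
    """MD5 checksum of the normalised sequence, as 32 hex characters."""
    return hashlib.md5(normalise(sequence).encode("ascii")).hexdigest()


def trunc512_id(sequence):
    """First 24 bytes of the SHA-512 digest, as 48 hex characters."""
    digest = hashlib.sha512(normalise(sequence).encode("ascii")).digest()
    return digest[:24].hex()


def sequence_path(identifier, start=None, end=None):
    """Build the request path for a whole sequence or a sub-range of it."""
    path = f"/sequence/{identifier}"
    if start is not None and end is not None:
        path += f"?start={start}&end={end}"
    return path


# Two differently formatted copies of the same toy sequence.
first = md5_id("ACGT\nACGT")
second = md5_id("acgtacgt")
print(first == second)
print(sequence_path(trunc512_id("ACGTACGT"), start=0, end=4))
```

Both copies yield the same identifier, so the name an organization gives a sequence no longer matters: identical content always resolves to the same checksum.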

Beacon API v1

The Beacon API can be implemented as a web-accessible service that users may query for information about a specific allele. A user of a Beacon can pose the query “Have you observed this nucleotide (e.g. C) at this genomic location (e.g. position 32,936,732 on chromosome 13)?” to which the Beacon responds with either “yes” or “no”. The new release of the Beacon API extends its functionality through support for additional types of genomic variants and improved metadata support. Additionally, the accompanying ELIXIR Beacon reference implementation demonstrates ELIXIR Authorization and Authentication Infrastructure (AAI), enabling data owners to light Beacons at different tiers of data access: public, registered, or controlled.
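The example question above can be sketched as a Beacon v1 style allele query. This sketch assumes that the position quoted in the text is 1-based while the Beacon `start` parameter is 0-based, that the query is served at a `/query` path, and that the response JSON carries a boolean `exists` field. The host `beacon.example.org`, the assembly `GRCh37`, and the reference base `G` are placeholders chosen for illustration, not values taken from the text.

```python
import json
from urllib.parse import urlencode


def beacon_query_url(base_url, assembly_id, reference_name, position_1based,
                     reference_bases, alternate_bases):
    """Build an allele query URL, converting a 1-based position to 0-based."""
    params = {
        "assemblyId": assembly_id,
        "referenceName": reference_name,
        "start": position_1based - 1,  # assumed 0-based on the wire
        "referenceBases": reference_bases,
        "alternateBases": alternate_bases,
    }
    return f"{base_url.rstrip('/')}/query?{urlencode(params)}"


def beacon_says_yes(response_body):
    """Reduce a Beacon response to the yes/no answer described above."""
    return bool(json.loads(response_body).get("exists"))


# Placeholder host, assembly, and reference base; only the chromosome,
# position, and observed nucleotide come from the example in the text.
url = beacon_query_url("https://beacon.example.org", "GRCh37", "13", 32_936_732, "G", "C")
print(url)
print(beacon_says_yes('{"exists": true}'))
```

The response is deliberately reduced to a single boolean here; a real Beacon may return additional metadata alongside it.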

CRAM File Format v3

The CRAM file format is an efficient storage format for read data. It achieves significantly better lossless compression than BAM while maintaining full compatibility with it.

Family History Tools Inventory

The Family History Tools Inventory is a catalogue of tools currently available for documenting family health history. The Statement of Best Practice highlights current approaches and challenges in using family history to guide clinical care, and is aimed at developers of clinically oriented family history collection systems, including standalone and EHR-integrated systems. The inventory will be updated periodically, and we encourage recommendations of other tools to include. To recommend a tool, please email info@ga4gh.org.

SAM/BAM File Format v1

Specifications for storing next-generation sequencing read data.

VCF v4.3 / BCF v2.1 File Format

The specifications for the Variant Call Format (VCF) and its binary counterpart, BCF.

Genomics API (Retired)

The Genomics API was intended to act as a suite of integrated APIs, each targeting a different aspect of exchanging genomic information between data providers and consumers. The Genomics API, together with the Reference Server and Compatibility test suite, was retired on January 24, 2018, and several of the sub-APIs are now being pursued under the auspices of new GA4GH Work Streams. You may still fork this repository if you wish to continue development. To learn more about the decision to retire the API, read the meeting minutes of the GA4GH Engineering Committee. For additional questions or to get involved with ongoing technical work at GA4GH, please email rishi.nag@ga4gh.org.
