19 October 2020
Members of the Global Alliance for Genomics and Health (GA4GH) Work Streams have submitted the Search API V1 and the Task Execution Service (TES) API V1 for approval, a process that begins with a 30-day public comment period.
Developed by the Discovery Work Stream, the GA4GH Search API V1 provides a simple, uniform mechanism to publish, discover, query, and analyze biomedical data, specifically any “rectangular” data that fits into rows and columns. The API is composed of two principal components: a “tables” API that exposes structured tabular data, and a “query” API that supports SQL queries over that data. It is intentionally general-purpose and minimal: it does not prescribe a particular backend implementation or data model, and it supports federation by design. Public implementations or deployments of the Search API have been set up to expose data from the MSSNG database, a subset of the dbGaP GECCO database (which contains data collected by the Genetics & Epidemiology of Colorectal Cancer Consortium), and the Canadian COVID Cloud portal.
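To make the “query” component more concrete, here is a minimal Python sketch of how a client might prepare a SQL query for a Search API server. It is an illustration under stated assumptions, not a reference client: the `/search` path, the `{"query": ...}` request body, and the `data` key in each response page reflect our reading of the draft specification and should be checked against it, while the server URL and the table name `demo.samples` are hypothetical. The request is built but never sent.

```python
import json
from urllib.request import Request


def build_search_request(base_url: str, sql: str) -> Request:
    """Build (but do not send) a POST request carrying a SQL query.

    Assumption: the query component accepts POST {base_url}/search with a
    JSON body of the form {"query": "<SQL>"}.
    """
    body = json.dumps({"query": sql}).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def collect_rows(pages):
    """Flatten the rows from an iterable of already-decoded response pages.

    Assumption: each page is a dict whose "data" key holds a list of rows.
    Pages without that key contribute no rows.
    """
    rows = []
    for page in pages:
        rows.extend(page.get("data", []))
    return rows


# Hypothetical server URL and table name, for illustration only.
request = build_search_request(
    "https://search.example.org/",
    "SELECT * FROM demo.samples LIMIT 10",
)
```

Because the API does not prescribe a backend, the same request shape would apply whether the server fronts a relational database or another store; only the base URL and the table names in the SQL would change.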
Developed by the Cloud Work Stream, the TES API V1 provides a standardized API for requesting asynchronous batch jobs. Under the hood, the API may wrap other batch compute systems, such as SLURM or AWS Batch. The user provides a description of the task, including manifests of input and output files, as well as a container image name and command line. The TES server then manages the deployment, scheduling, running, and cleanup of the job while providing status and logging information through the API. Implementations or deployments of the TES API have been set up at the Broad Institute, EMBL’s European Bioinformatics Institute (EMBL-EBI), and Oregon Health & Science University (OHSU).
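The task description outlined above can be sketched as a small Python helper that assembles the JSON document a client would submit. This is a hedged illustration, not an official client: the field names (`name`, `inputs`, `outputs`, `executors`, `image`, `command`, `url`, `path`) and the state names follow our understanding of the TES schema and should be verified against the specification, and the bucket URLs, file paths, and task name are hypothetical.

```python
import json


def make_tes_task(name, image, command, inputs, outputs):
    """Assemble a task description as a plain dict.

    `inputs` and `outputs` are lists of (url, path) pairs: the storage URL
    of a file and the path where it appears inside the container.
    Assumption: these keys match the TES task schema.
    """
    return {
        "name": name,
        "inputs": [{"url": url, "path": path} for url, path in inputs],
        "outputs": [{"url": url, "path": path} for url, path in outputs],
        "executors": [{"image": image, "command": list(command)}],
    }


# Assumption: states after which a task will not change any further.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def is_finished(state: str) -> bool:
    """Return True once a polled task state is terminal."""
    return state in TERMINAL_STATES


# Hypothetical example: checksum one input file and store the result.
task = make_tes_task(
    name="md5sum-demo",
    image="ubuntu:20.04",
    command=["sh", "-c", "md5sum /data/input.txt > /data/output.txt"],
    inputs=[("s3://example-bucket/input.txt", "/data/input.txt")],
    outputs=[("s3://example-bucket/output.txt", "/data/output.txt")],
)
payload = json.dumps(task, indent=2)  # body a client would send to the server
```

A client would submit `payload` to the server and then poll the task's state until `is_finished` returns true; the submission and polling calls are omitted here because they depend on the deployment.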