9 Mar 2021
Recently approved by the GA4GH Standards Steering Committee, the Task Execution Service (TES) API v1 provides a standard mechanism for orchestrating complex genomic analyses across different compute environments.
Federated analysis of data distributed across the world can make genomics research more powerful by connecting multiple large-scale datasets for simultaneous analysis.
Such investigations utilize complex methods, such as aligning multiple sequences to the human reference genome to identify potentially pathogenic variants. These analyses often involve up to hundreds of thousands of computational tasks, which can take considerable time and compute power to execute. Recently approved by the GA4GH Standards Steering Committee, the Task Execution Service (TES) API v1 provides a standard mechanism for orchestrating these complex analyses across different compute environments.
To support large-scale federated analysis, institutions and organizations employ queueing systems that send tasks out to high-performance computing (HPC) clusters or cloud environments. But each compute system is unique, and each cloud vendor uses incompatible APIs for running batch tasks. Because of these discrepancies, researchers carrying out federated analysis must write separate code for each environment.
The TES API adds to the suite of standards produced by the GA4GH Cloud Work Stream, whose mission is to help the genomics and health community take full advantage of modern cloud environments by bringing algorithms to data that cannot be moved due to various regulatory limitations.
“By building analysis software with the TES API, researchers can quickly move from a university cluster, to Amazon, to Microsoft Azure, without changing their code,” said Kyle Ellrott, Assistant Professor of Computational Biology at Oregon Health & Science University and co-lead of the TES API development team. “With the TES API, moving large-scale batch computing between private computers and the cloud becomes seamless.”
On the backend, the TES API wraps around an institution’s HPC system or cloud environment, and then manages the deployment, scheduling, running, and clean-up of tasks while providing status updates and logging information back to the researcher.
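To make the task lifecycle described above more concrete, here is a minimal sketch of what a task description and its status states might look like. The field and state names follow our reading of the TES v1 schema and should be checked against the published specification; the task name, container image, and storage URLs are hypothetical.

```python
import json

# State names as we understand them from the TES v1 schema (an assumption;
# verify against the specification before relying on them).
ACTIVE_STATES = {"QUEUED", "INITIALIZING", "RUNNING", "PAUSED"}
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def build_task(name, image, command, inputs=None, outputs=None,
               cpu_cores=1, ram_gb=1.0):
    """Return a dict shaped like a TES task document."""
    return {
        "name": name,
        # Each executor is one containerized command to run.
        "executors": [{"image": image, "command": list(command)}],
        # Inputs are staged in before execution; outputs are staged out after.
        "inputs": list(inputs or []),
        "outputs": list(outputs or []),
        "resources": {"cpu_cores": cpu_cores, "ram_gb": ram_gb},
    }


def is_finished(state):
    """True once a task has reached a state it will not leave."""
    return state in TERMINAL_STATES


# Hypothetical alignment task; none of these names come from the article.
task = build_task(
    name="align-sample-001",
    image="example.org/aligner:1.0",
    command=["align", "/data/reads.fastq", "/data/out.bam"],
    inputs=[{"url": "s3://example-bucket/reads.fastq",
             "path": "/data/reads.fastq"}],
    outputs=[{"url": "s3://example-bucket/out.bam",
              "path": "/data/out.bam"}],
    cpu_cores=4,
    ram_gb=16.0,
)

print(json.dumps(task, indent=2))
```

A client would poll the task until `is_finished` returns true, then read the logs the server reports back.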
For example, if a researcher is running genomic analysis pipelines, they may send out a thousand task requests, usually with the help of a workflow engine. The workflow engine, which may be custom made or from an existing software project, needs a way to talk to the local compute resources. The TES server accepts the requests, communicates with the local job queuing system, and tracks progress and output. This is all done in a single API that looks the same no matter what infrastructure manages the computational resources. Thus, the TES API provides a flexible and standardized approach to connect complex workflow engines to new compute systems—saving time and resources.
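The idea of a single API that looks the same on every infrastructure can be sketched as follows. The snippet only builds HTTP requests and sends nothing. The endpoint URLs are hypothetical, and the `/v1/tasks` path is an assumption that may differ between TES versions and deployments.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed path for the task collection; check the server's documentation.
TASKS_PATH = "/v1/tasks"


def make_create_request(base_url, task, tasks_path=TASKS_PATH):
    """Build (but do not send) the request that submits a task."""
    body = json.dumps(task, sort_keys=True).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + tasks_path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def make_status_request(base_url, task_id, tasks_path=TASKS_PATH):
    """Build (but do not send) the request that fetches a task's status."""
    url = f"{base_url.rstrip('/')}{tasks_path}/{quote(task_id, safe='')}"
    return urllib.request.Request(url=url, method="GET")


# Hypothetical TES servers: one in front of a cluster, one in a cloud.
ENDPOINTS = [
    "https://tes.cluster.example.org",
    "https://tes.cloud.example.com/",
]

task = {
    "name": "call-variants",  # hypothetical task
    "executors": [{"image": "example.org/caller:2.1",
                   "command": ["call", "/data/sample.bam"]}],
}

requests_ = [make_create_request(url, task) for url in ENDPOINTS]
for req in requests_:
    print(req.get_method(), req.full_url, len(req.data), "bytes")
```

Only the base URL differs between the two requests; the task body is byte-for-byte identical, which is the property that lets a workflow engine move between environments without new code.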
Furthermore, the TES API can help extend systems that provide the Workflow Execution Service (WES) API, another GA4GH Cloud standard. While the WES API orchestrates a series of steps in a workflow, the TES API can connect the workflow to a compute backend to execute specific steps. So when a researcher takes their WES-enabled workflow engine to a new computational environment, they can plug into the local TES API without having to write new adaptors.
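The division of labour between the two standards can be illustrated with a toy example: a workflow layer decides the order of steps, and each step is handed to a task runner. This is a simplified sketch, not how any particular WES implementation works. `Step`, `run_workflow`, and the `run_task` callable are hypothetical names, and a real engine would submit tasks over HTTP and poll for completion.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    """One step of a workflow, as a workflow engine might hold it."""
    name: str
    image: str
    command: List[str]


def step_to_task(step: Step) -> Dict:
    """Translate a workflow step into a TES-style task document."""
    return {
        "name": step.name,
        "executors": [{"image": step.image, "command": list(step.command)}],
    }


def run_workflow(steps: List[Step],
                 run_task: Callable[[Dict], str]) -> List[Tuple[str, str]]:
    """Run steps in order, stopping at the first one that does not complete.

    `run_task` stands in for "submit to TES and wait"; it returns the
    task's final state.
    """
    results = []
    for step in steps:
        state = run_task(step_to_task(step))
        results.append((step.name, state))
        if state != "COMPLETE":
            break
    return results


# Hypothetical three-step pipeline.
STEPS = [
    Step("align", "example.org/aligner:1.0", ["align", "reads.fastq"]),
    Step("call-variants", "example.org/caller:2.1", ["call", "out.bam"]),
    Step("annotate", "example.org/annotator:0.3", ["annotate", "out.vcf"]),
]

# A stand-in runner that pretends every task succeeds.
print(run_workflow(STEPS, lambda task: "COMPLETE"))
```

Because `run_workflow` depends only on the `run_task` callable, swapping in a different compute environment means swapping that one function, not rewriting the workflow logic.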
“This concept of pluggable compute backends is key to the TES API,” said Ania Niewielska, Lead Software Engineer at EMBL-EBI and co-lead of the TES API development team. “Since many existing workflow engines have already implemented the TES API, adding support for a new compute backend, such as a new cloud provider, can be achieved through a single TES implementation—instead of writing separate implementations for each workflow engine. Additionally, TES backends can be implemented in the technology of choice, independent from the tech stack used for the workflow engines.”
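A rough sketch of the pluggable-backend idea: the workflow engine talks to one TES-style surface, and each compute environment supplies a single backend behind it. Every class here is a hypothetical toy written for illustration and is not taken from any TES implementation.

```python
from abc import ABC, abstractmethod
import itertools


class Backend(ABC):
    """What a compute environment must provide to sit behind the API."""

    @abstractmethod
    def submit(self, task: dict) -> str:
        """Accept a task and return a job id."""

    @abstractmethod
    def state(self, job_id: str) -> str:
        """Return the current state of a job."""


class ImmediateBackend(Backend):
    """Toy backend that pretends every job finishes instantly."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submit(self, task):
        job_id = f"local-{next(self._ids)}"
        self._jobs[job_id] = "COMPLETE"
        return job_id

    def state(self, job_id):
        return self._jobs[job_id]


class QueueBackend(Backend):
    """Toy backend whose jobs stay QUEUED until drain() is called."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submit(self, task):
        job_id = f"queue-{next(self._ids)}"
        self._jobs[job_id] = "QUEUED"
        return job_id

    def state(self, job_id):
        return self._jobs[job_id]

    def drain(self):
        for job_id in list(self._jobs):
            self._jobs[job_id] = "COMPLETE"


class TesFrontend:
    """The uniform surface a workflow engine sees, whatever the backend."""

    def __init__(self, backend: Backend):
        self._backend = backend

    def create_task(self, task: dict) -> dict:
        return {"id": self._backend.submit(task)}

    def get_task(self, task_id: str) -> dict:
        return {"id": task_id, "state": self._backend.state(task_id)}


def engine_run(frontend: TesFrontend, task: dict) -> str:
    """Stand-in for any workflow engine: it never touches the backend."""
    task_id = frontend.create_task(task)["id"]
    return frontend.get_task(task_id)["state"]


task = {"name": "demo", "executors": [{"image": "example.org/tool:1.0",
                                       "command": ["run"]}]}
print(engine_run(TesFrontend(ImmediateBackend()), task))
print(engine_run(TesFrontend(QueueBackend()), task))
```

`engine_run` is identical for both backends. Supporting a new environment means writing one more `Backend` subclass, rather than one adaptor per workflow engine.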
“The European life science infrastructure is very fragmented,” said Alexander Kanitz, co-lead of the ELIXIR Cloud project, a GA4GH Driver Project. “The TES API offers a means to abstract over different compute backends in an effort to federate the execution of computational workflows across the various nodes, from hospitals to research centers. This is one of the key reasons why we chose to implement the TES API.”
Joris Vankerschaver, Manager of Strategic Technologies and Life Science Solutions at Enthought, said, “The TES API allows us to ‘code against TES’ rather than against a particular environment. This is important when working with clients who may have a mixture of on-premise servers and cloud resources, or who are interested in moving to the cloud.”
The TES API was also designed with real-world constraints in mind. “The complexity of moving health-generated data for secondary research purposes is immense, due to patient privacy, security, and legal considerations,” said Leslie Glass, Project Manager at EMBL-EBI, where she leads the CINECA project. “We chose to implement the TES API to help ensure that these data do not become siloed and inaccessible for research.”
Many workflow engines, including Cromwell, Nextflow, and Snakemake, have already begun to support the TES API. In the future, the team plans to expand support for the API and to focus on compatibility with other GA4GH standards, including the Data Repository Service (DRS) API and the GA4GH Passports Specification to manage authentication and authorization.