Canadian Genomics Cloud to develop GA4GH-compliant precision medicine platform

16 Feb 2018

The Canadian Genomics Cloud, a national cloud-based infrastructure for genomics data sharing, will develop an end-to-end software solution that complies with GA4GH standards from the ground up.

“Genomics will not stop with As, Cs, Ts, and Gs,” says Marc Fiume, Co-Lead of the GA4GH Discovery Work Stream and CEO of DNAstack. “We’re going to have to integrate phenotypic and clinical information and run machine learning to extract insights; physicians will need to make diagnoses and prescribe precision medications; pharma companies will need to use genomics to understand the efficacy of their drugs.”

All of this, Fiume says, requires not just the generation of genomics data, but also the ability to process it in the cloud, connect it to clinical records, and use those discoveries in a systematic way to design a holistic solution for applying genomic medicine in healthcare.

This week, DNAstack, together with Canada’s Genomics Enterprise (CGEN), fellow GA4GH Member Organizations Google and the Centre of Genomics and Policy, and others, launched the Canadian Genomics Cloud (CGC): a national cloud-based infrastructure for genomics initiatives to share data across Canada.

The hope, Fiume says, is that the CGC will lay the groundwork for a future national precision medicine initiative by demonstrating that the Canadian genomics ecosystem is ready to bring together high-powered cloud and sequencing facilities and integrate their systems to enable easy, secure data sharing, discovery, and exchange.

To that end, the CGC will be developing all of its solutions according to GA4GH standards. “The software will build upon a set of principles, many of which are shared by GA4GH,” Fiume says. “For example, we need to do this at scale and we need data and methods to be shared in the cloud.” Tactically, he says, this will look like a suite of GA4GH and clinical application programming interfaces (APIs) that work together to allow data to transition from the sequencer to the scientist and ultimately to the clinician.

In particular, the effort will align itself with the standards developed by the GA4GH Cloud and Discovery Work Streams. “We plan to continue to follow the trajectory of those new Work Streams, expecting that it will take about a year for new APIs to be stable and to have mature implementations at CGC.”

As announced earlier this week in the 2018 Strategic Roadmap, the Cloud Work Stream plans to develop a set of cohesive, interoperable APIs for virtually storing, analysing, and sharing data. The Tool Registry Service (TRS), Workflow Execution Service (WES), Data Object Service (DOS), and Task Execution Service (TES) APIs are designed to work together to allow researchers at disparate institutions to bring their analyses to data stored in the cloud rather than transferring these large datasets between institutions or around the globe.

The result, says the team, is “highly portable analysis code that ultimately enables ‘FAIR’ science, e.g. findable, accessible, interoperable, and reproducible tools, workflows, and datasets.”

The Discovery Work Stream is also working to make genomics data comply with the “FAIR” principles, but it focuses on developing APIs that make it possible for researchers at one institution to learn about data at a different institution. Together, the standards put forth by the Discovery Work Stream are intended to enable a global federated network of searchable genomic information.

As part of its mandate, the CGC will enable a national directory of shared data, which will build upon the standards of the Discovery Work Stream. “We’re keen to establish a mechanism for researchers to share data to make it analysable in different organizations,” says Fiume. “Data from Toronto should be analysable in Vancouver and vice versa. The CGC, built on top of GA4GH standards, will help to better connect scientists to data, tools, and each other, to help realize what we expect to be a watershed moment where these connections power systematic discoveries.” The CGC will also allow collaboratories of disparate organizations to share data in a way that is straightforward, secure, and transparent.
