4 Mar 2014
The first GA4GH Partner Meeting was hosted by the Wellcome Trust and provided an opportunity for GA4GH partners to reflect on progress and set concrete goals and deliverables to accomplish in the near term. Over 180 individuals participated in the plenary meeting, representing 100 organizations active in 30 countries. In addition, each of the four initial Working Groups convened a full-day meeting immediately before or after the plenary session, providing in-depth discussion of specific topics. All of the meetings were highly interactive and played a critical role in informing the next steps of GA4GH.
The Global Alliance for Genomics and Health (Global Alliance), formed in 2013, brings together leading international organizations working in healthcare, biomedical research, disease and patient advocacy, life science, and information technology. The partners in the Global Alliance are working together to create a common framework of harmonized approaches – seeking best practices where they exist, and developing new approaches where needed – to enable the responsible, secure, and effective sharing of genomic and clinical data.
Since drafting an initial White Paper and announcing the formation of the Global Alliance in June 2013, the Alliance has made substantial effort to engage the community and progress towards understanding the existing landscape. As of April 2014, over 175 organizations have joined the Global Alliance, and four Working Groups have been established and are already working to accelerate progress in the areas of: (a) genomic data, (b) clinical data, (c) security and privacy, and (d) ethics and regulation. This report provides an update on the activities of the Global Alliance leading up to and following the first face-to-face meeting held in March 2014.
The world has changed considerably since the human genome was sequenced. The cost of genome sequencing has fallen almost one-million-fold, making it possible to systematically characterize the role of genomic variation, inherited and acquired, for its relationship to human biology and disease, enabling both research and clinical practice. In parallel, changes in computing – global sharing of information through the internet, cloud computing, and interoperability among data sets – make possible greater sharing of and learning from data. The enormous volume of genomic data available and likely to come has the potential to improve patient care in many ways, such as fast-tracking diagnosis and disease gene discovery, targeting therapies based on genetic subtype, tracking new outbreaks, or identifying new markers of antimicrobial resistance.
Yet while sharing genomic data is widely acknowledged to be necessary to drive major scientific leaps in our understanding of disease, there are currently significant challenges that limit progress. The ad hoc use of different data formats and technologies in different systems, lack of alignment between approaches to ethics and national legislation across jurisdictions, and the challenges of devising secure systems for controlled sharing of data are some of the barriers that prompted the creation of the Global Alliance.
These challenges are what the Global Alliance seeks to address through plenary meetings and Working Groups, with the aim of enabling the responsible sharing of genomic and clinical data. The first meeting of partners and stakeholders in March 2014 at the Wellcome Trust was vital in providing critical input into the Alliance’s activities, priorities, and future deliverables.
The first Global Alliance Partner Meeting was hosted by the Wellcome Trust in London on March 4, 2014 and provided an opportunity for Alliance partners to reflect on progress and set concrete goals and deliverables to accomplish in the near term. Over 180 individuals participated in the plenary meeting, representing 100 organizations active in 30 countries. In addition, each of the four initial Working Groups convened a full-day meeting immediately before or after the plenary session, providing in-depth discussion of specific topics. All of the meetings were highly interactive and played a critical role in informing the next steps of the Global Alliance.
Presentation slides are available on the Global Alliance website at: http://genomicsandhealth.org/news-events/events/march-4th-meeting-presentations.
The plenary session began by reviewing the mission of the Global Alliance and its mode of working. Specifically, the mission of the Global Alliance is: “To accelerate progress in human health by helping to establish a common framework of harmonized approaches to enable effective and responsible sharing of genomic and clinical data, and by catalyzing data sharing projects that drive and demonstrate the value of data sharing.” To achieve this mission, the Alliance will “convene stakeholders, catalyze sharing of data, create harmonized approaches, act as a clearinghouse, foster innovation, and promote responsible data sharing.”
Following this introductory overview, there were updates from each of the four Working Groups, interspersed with presentations from several existing data sharing initiatives by partner organizations, providing concrete examples of data sharing activities. These examples included: The International Cancer Genome Consortium; Melbourne Genomics Health Alliance; Genome Matchmaker (making it possible for labs studying rare genotypes and phenotypes to query one another for the existence of matching data); ELIXIR (a pan-European research infrastructure for biological information); and the Baylor College of Medicine Human Genome Sequencing Center.
After these presentations, breakout sessions that involved all meeting participants focused on identifying the top priority in each Working Group area and metrics for success. Finally, the group returned to plenary session to review progress over the course of the day, to review the priorities from the breakout groups, and to plan next steps.
The interactions at the plenary and satellite meetings were intended to give Global Alliance partners the opportunity to share progress, offer feedback, develop new ideas, and arrive at a shared understanding of the key priorities for future action.
Based on the discussion during the March 4th Partner Meeting, a number of key themes emerged that will guide the Global Alliance’s approach and focus:
Each of the Working Groups has defined goals for 2014, informed by preparation for and feedback received at the plenary and satellite meetings. While effective progress will only come if each group has focus and expertise, the issues are crosscutting: it is agreed that in many cases the Working Groups will need to collaborate with each other, the broader Alliance membership, and other stakeholders to achieve these goals.
In order to root the activities of the Global Alliance in real-world problems and to demonstrate the value of interoperable approaches to data sharing, the Alliance will support specific projects. These may be initiatives undertaken by partner organizations and seen as exemplar efforts to help demonstrate needs and guide work for the Alliance through one or more Working Groups.
Examples of such projects that were discussed at the Partner Meeting and received substantial interest and support include the International Cancer Genome Consortium (ICGC), Genomic Matchmaker initiative (now referred to as “Matchmaker Exchange”), the Beacon project, the P3G-IPAC (Public Population Project in Genomics and Society International Policy interoperability and data Access Clearinghouse), and efforts to define genotype-phenotype relationships at specific genes of high interest, among others.
Ongoing engagement between these projects and Working Groups is intended to encourage a focus on the needs of projects currently advancing science and medicine, and crosscutting engagement of the Working Groups with one another and with stakeholders in the community.
Given the broad remit of the Alliance and its diverse membership, it can be challenging to ensure that different members and groups have a shared understanding of goals and tasks. To articulate and communicate clearly shared goals coming out of the plenary meeting, an effort has started to develop specific use cases that can define, delineate, and guide the near-term work of the Global Alliance.
Each use case will include a specific objective and spell out the rationale for this objective, followed by the roles and specific steps required to achieve that objective. In addition to defining, communicating, and focusing Global Alliance activity, use cases can also identify interdependencies between Alliance Working Groups, projects, and partners.
Already, the initial development of use cases has highlighted the different levels of data sharing that are envisioned under different scenarios: from group queries that do not involve exchange of individual level data (e.g., whether a dataset contains information on a given genotype or phenotype, without actually revealing that information, or exchanging information on consent or security standards), to data sharing at the individual level (e.g., of limited or broad-based information on genotype and / or phenotype). Moreover, sharing of information that is performed within a given regulatory domain raises different questions than those that cross jurisdictional domains.
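The lowest level of sharing described above, a group query that reports only whether matching data exists, can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not a Global Alliance specification: the `Variant` type, the `dataset_has_variant` function, and the sample records are hypothetical names and made-up data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not a Global Alliance schema)."""
    chromosome: str
    position: int   # 1-based coordinate on the reference genome
    alternate: str  # observed alternate allele


def dataset_has_variant(dataset: Iterable[Variant], query: Variant) -> bool:
    """Answer only yes or no: is the queried variant present in the dataset?

    No individual-level records are returned to the caller.
    """
    return any(record == query for record in dataset)


# Made-up records held privately by one hypothetical site.
site_records = [
    Variant("1", 1000, "A"),
    Variant("2", 2500, "T"),
]

print(dataset_has_variant(site_records, Variant("2", 2500, "T")))  # True
print(dataset_has_variant(site_records, Variant("2", 2500, "G")))  # False
```

Sharing at the individual level would instead return the matching records themselves, which is why it raises the additional consent, security, and jurisdictional questions noted above.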
Much of the March 4th Partner Meeting was focused on sharing current efforts and defining near-term goals for the four initial Working Groups. Active participation and feedback on the current work and goals were encouraged from meeting participants. Below is a summary of what was presented and discussed during the March 4th Partner Meeting by each Working Group. (The summaries below do not reflect revisions made in response to feedback at the meeting.)
Using the full-day Working Group satellite meetings and the discussions summarized below, each Working Group then revised and expanded their near-term priorities that will drive their efforts. Updated priority documents for each group can be found at: http://genomicsandhealth.org/our-work/working-groups.
Data Working Group
David Haussler (University of California, Santa Cruz, U.S.A.), co-Chair of the Data Working Group, updated meeting participants on the Working Group’s activities, including the activities of Task Teams under the Data Working Group. These Teams currently include representatives from several academic centers, the European Bioinformatics Institute (EBI) and the US National Center for Biotechnology Information (NCBI), as well as companies such as Google, Microsoft, and Amazon. The plan is to use open Internet resources and engagement of interest groups around each Task Team to facilitate broader input.
During the meeting, it was discussed that genomic data is currently stored and shared in a variety of formats – BAM, CRAM and VCF – but these file formats have considerable shortcomings in a global, increasingly cloud-based, environment, and were not designed for the clinical settings in which they are now being used. Despite the staggering amount of genomic data being produced globally, there is very little governance of data exchange formats.
The presentation described how the Data Working Group has incorporated the creators of the existing file formats (e.g., BAM, CRAM, VCF) into a special File Formats Task Team. It has been agreed that, going forward, the Data Working Group will oversee and provide governance for the management of these genomic data file format standards. The Team supports BAM and CRAM for representing sequence reads and their alignment to reference genomes, and VCF for representing genetic variation in individuals.
During the presentation, it was noted that while file formats for data exchange have been useful, they are not sufficient going forward. File formats will not allow data systems to scale up from thousands of genomes to millions of genomes unless they are accompanied by formal data models and application programming interfaces (APIs) that allow large data sets of genome “reads” to be reorganized for efficient automated access along multiple dimensions. Because of this, David Haussler shared that over the next two years, a Read Store Task Team of the Data Working Group will act as an international coordinating effort in devising formal data models and APIs for representing, submitting, exchanging, querying, and analyzing genomic data.
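To make the contrast with flat files concrete, the sketch below shows in Python what a formal data model plus a small API might look like when reads must be reachable along more than one dimension (by sample and by genomic region). It is a toy, in-memory illustration only: the `Read` and `ReadStore` names, their fields, and the example records are assumptions invented here, not the Read Store Task Team's design.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Read:
    """Hypothetical aligned read; fields are illustrative assumptions."""
    sample_id: str
    chromosome: str
    start: int      # 0-based, inclusive
    end: int        # exclusive
    sequence: str


class ReadStore:
    """Toy store that indexes each read along two dimensions."""

    def __init__(self) -> None:
        self._by_sample: Dict[str, List[Read]] = defaultdict(list)
        self._by_chromosome: Dict[str, List[Read]] = defaultdict(list)

    def submit(self, read: Read) -> None:
        # Register the read under both indexes at submission time.
        self._by_sample[read.sample_id].append(read)
        self._by_chromosome[read.chromosome].append(read)

    def reads_for_sample(self, sample_id: str) -> List[Read]:
        return list(self._by_sample.get(sample_id, []))

    def reads_overlapping(self, chromosome: str, start: int, end: int) -> List[Read]:
        # Two half-open intervals overlap when each starts before the other ends.
        return [
            r for r in self._by_chromosome.get(chromosome, [])
            if r.start < end and start < r.end
        ]


store = ReadStore()
store.submit(Read("sampleA", "1", 100, 104, "ACGT"))
store.submit(Read("sampleA", "2", 10, 14, "TTGA"))
store.submit(Read("sampleB", "1", 102, 106, "GGCA"))

print(len(store.reads_for_sample("sampleA")))        # 2
print(len(store.reads_overlapping("1", 103, 105)))   # 2
```

A caller sees only `submit`, `reads_for_sample`, and `reads_overlapping`; how the reads are physically organized behind that interface is left to the implementation, which is the separation a file format alone does not provide.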
The presentation also focused on the fact that the current Balkanized system of naming and indexing human genome variation relative to a canonical human reference genome structure will be challenged to scale to the clinical application of millions of genomes. Another Task Team has been set up on ‘Reference Variation’ to work with the community to develop the next generation of human genetic reference that includes known variation and will scale properly. The key activities are to identify known variants, resolve inconsistencies, and create a standardized format for novel variants.
Many partners offered feedback that, based on their experience, developing APIs requires both cooperation and competition. Meeting participants agreed that cooperation will be vital in creating a common API, but once this application interface is agreed upon, competition between technical groups will be important in creating the most efficient technological solution. The phrase “open interface, competition on implementation” was used to capture this spirit.
In addition to standards, it was relayed that the Task Teams are each developing reference implementations and publicly (openly) available benchmark data sets so as to support evaluation of methods, e.g., for genetic variant calling. The intention is that benchmarking be available to any developer at any time. One attractive possibility is to use cloud services so that, in addition to comparative evaluation based on accuracy, different implementations can be evaluated in terms of time and memory requirements within the same runtime environment, and so that all results are reproducible because executable code is stored.
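As a rough illustration of this kind of benchmarking, the Python sketch below scores a variant caller against a benchmark truth set on accuracy and wall-clock time within one environment. It is a minimal sketch under stated assumptions: `benchmark_caller`, `toy_caller`, the `VariantKey` tuple layout, and the data are all hypothetical, and memory measurement is omitted.

```python
import time
from typing import Callable, Dict, Set, Tuple

# Hypothetical variant identity: (chromosome, position, alternate allele).
VariantKey = Tuple[str, int, str]


def benchmark_caller(
    caller: Callable[[], Set[VariantKey]],
    truth: Set[VariantKey],
) -> Dict[str, float]:
    """Run a caller once and report precision, recall, and elapsed seconds."""
    started = time.perf_counter()
    called = caller()
    elapsed = time.perf_counter() - started

    true_positives = len(called & truth)
    return {
        "precision": true_positives / len(called) if called else 0.0,
        "recall": true_positives / len(truth) if truth else 0.0,
        "seconds": elapsed,
    }


# Made-up benchmark truth set and a stand-in caller for demonstration.
truth_set = {("1", 100, "A"), ("1", 200, "C"), ("2", 50, "G"), ("3", 75, "T")}


def toy_caller() -> Set[VariantKey]:
    return {("1", 100, "A"), ("2", 50, "G"), ("2", 999, "T")}


result = benchmark_caller(toy_caller, truth_set)
print(round(result["precision"], 2), result["recall"])  # 0.67 0.5
```

Because every implementation is passed through the same function in the same environment, the accuracy and timing figures are directly comparable across callers.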
Finally, David Haussler shared that two more Task Teams are envisioned to launch in the coming year: one to create an API for the representation of gene expression and the epigenetic state of DNA in a tissue sample, and one for metadata, which is general information about a sample such as tissue type, including how, when and where information was extracted from that sample, such as the name of the sequencing center.
Regulatory and Ethics Working Group
At the meeting, Bartha Knoppers (McGill University, Montreal, Canada) and Kazuto Kato (Osaka University, Osaka, Japan), co-Chairs of the Regulatory and Ethics Working Group, outlined the group’s initial focus on developing an International Code of Conduct for Genomic and Clinical Data Sharing.
The co-Chairs discussed how such a code is intended to articulate a set of ethical principles for research and the sharing of genomic and clinical data, covering key issues such as consent, privacy, feedback to patients and research participants, data deposit and access, and sanctions. The Code would define protections against those who misuse data that are volunteered for the public good.
The Regulatory and Ethics Working Group is creating this Code of Conduct through collaboration with multiple international consortia, including H3Africa, the Biobank Standardisation and Harmonisation for Research Excellence project (BioSHaRE), the International Cancer Genome Consortium (ICGC), and the International Rare Disease Research Consortium (IRDiRC).
The co-Chairs noted that funders, research consortia, and clinicians all operate within their own national frameworks and rules, but global data-sharing requires meta-level governance based on a universally shared human rights framework for addressing international problems. In addition to bioethics norms, there is a need for robust tools for the governance of genomics research. Such tools, namely an International Code of Conduct, can responsibly steer the sharing and stewardship of research discovery and the integration of genomic data with clinical data for genomic medicine.
The presentation shared how an International Code of Conduct would:
To develop the Code, the Working Group is drawing from different instruments, such as the Universal Declaration of Human Rights, which includes the right of access to the benefits of scientific research for all and the right for scientists to be recognized for their work.
The presenters described how an International Code of Conduct is timely, as both Europe and major economies such as India, Japan, and China are setting up new data protection laws. An International Code of Conduct could inform these laws while they are being created.
The co-Chairs also shared that the Working Group is developing the idea of a privacy safe haven. While genomics experts are accustomed to sharing information among themselves, international data sharing by clinicians poses new challenges, and such “havens” require structure and legitimacy.
Finally, it was discussed that in order for data to be shared across jurisdictions, international ethics review processes need to be streamlined, including criteria for international recognition of the possible equivalency of such processes. The goal would be that an ethics review undertaken in one country that meets such criteria could be recognized by other participating countries as substantially equivalent.
Clinical Working Group
Kathryn North (Murdoch Childrens Research Institute, Melbourne, Australia), co-Chair of the Clinical Working Group, presented the work and goals of the Clinical Working Group, explaining that the biggest focus initially has been to map existing endeavors—so as to build on current efforts, rather than “re-inventing the wheel.”
The presentation focused on the mandate of this group, which is to provide information to the genomics community on standards for representing phenotypic data and linking it to genotypic information. The initial focus will be on rare genetic disorders and cancer, later expanding to more complex disorders.
The first areas the group focused on are: (a) phenotype ontology (recording and sharing patient data in a standardized manner); (b) data harmonization; and (c) biomedical informatics and data extraction.
The presentation closed with a summary of the future goals of the Working Group. The group has identified the following key needs: establishing compatible, readily accessible, and scalable approaches for sharing clinical data and linking genomic data; and facilitating genotype-based clinical trial recruitment for rare diseases and cancer. The longer-term goal is to incorporate phenotypic data, linked with genomic data, into the electronic health record as a secure and individualized record of all relevant data for a specific patient that can then be linked more broadly to international data sharing initiatives.
Security Working Group
Paul Flicek (European Bioinformatics Institute, UK) and Dixie Baker (consultant to Genetic Alliance, US), co-Chairs of the Security Working Group, outlined some of the key issues for the Security Working Group. A core principle, they said, is that security that is transparent tends to be most effective. Institutions also need to have a pragmatic attitude about the realities of security breaches.
During the presentation, it was emphasized that the goal of the Security Working Group is to identify and support data technology solutions that provide assurance to patients, researchers, clinicians, and other stakeholders that genomic data are shared, annotated, and interpreted only by those with appropriate authorization.
The co-Chairs shared that core issues that the Working Group is focusing on include: identity management; access management; data integrity and immutability; audit and non-repudiation; availability; federated data-sharing models; and processing requirements.
They emphasized that defining the security practices and technologies in use, and communicating among data-sharing partners that these are equivalent, will be critical to creating this transparent and pragmatic environment. To this end, the work of this group will be driven by use cases that define: what types of data will be shared; whether sharing networks will involve a few repositories or many; whether data is stored on site or on shared “cloud” services; who will credential individual users and identity-proof individuals, and what level of assurance is required; and how global sharing policies will interact with national law.
The presentation noted that several relevant standards and technologies for protecting the privacy, confidentiality, and integrity of large data sets, and for federating databases across globally distributed entities already exist in areas beyond genomics, such as healthcare, business and finance. The group will review these technologies and standards, and road-test various security mechanisms.
The co-Chairs said that by 2015 the group plans to publish guiding principles, standards, and a technical framework that fulfill a set of requirements derived from use cases designed to enforce the policy within an environment that represents relevant technical, practical and community perspectives of genomic data sharing.