17 Oct 2014
The Global Alliance for Genomics and Health (Global Alliance) held its second major meeting on Saturday, October 18, 2014 in San Diego, California. The purpose of the plenary session was to convene the diverse community of Alliance members and stakeholders, share important progress made by the four Working Groups and projects being accelerated, and collectively discuss next steps for the Alliance to advance and scale up current efforts in genomic data sharing to improve human health.
This summary document includes a brief description of key themes, followed by a more comprehensive summary of each speaker’s remarks, featuring goals, accomplishments, next steps, and participants’ questions and feedback. In addition, the Appendix of this document includes brief descriptions of each Working Group’s all-day planning meeting and the cross-Working Group meeting that took place in addition to the plenary session.
There was strong consistency in the messages and key issues raised by the presenters, panelists, and meeting participants. These included:
Keith Yamamoto (University of California, San Francisco) opened the plenary meeting, framing the work of the Alliance as remarkable and inspiring within the broader field of genomics and health and at the “leading edge of precision medicine.”
Yamamoto described the Alliance as holding the promise to help spark a revolution across research, health, and healthcare. Yet, the gap between the data available and our understanding of its immense complexity is vast. The challenge confronting the Alliance is how to move through that information deluge, creating a “knowledge network” from the vast amount of data enabling precise diagnosis and treatment decisions for each individual, empowering further research, advancing clinical care, and informing patients and citizens.
Yamamoto outlined three types of reinforcing actions to continue to drive the mission of the Alliance.
First, the field requires significant additional data integration activities, including building the technical ability to integrate, store, and analyze genomic information.
Second, moving to a precision medicine approach will better serve the research and clinical communities. Yamamoto proposed a thought experiment: what if disease was classified by its genetic mechanism? One disease, such as breast cancer or diabetes, might be described by its multiple underpinning genomic mechanisms, just as one mechanism may be implicated in more than one disease.
Research and academic institutions in the Global Alliance network can advance this type of discovery, promote the development of more accurate disease classification, and advance diagnosis, therapy, and care. Operationalizing this research-to-health continuum would require four elements: an information commons and knowledge network, embracing the goal of precision medicine, building a layered knowledge network across disciplines, and drawing on network connections to carry innovation and discovery to new areas of research and patient groups.
Yamamoto described the power of a precision medicine approach in the case of Traumatic Brain Injury (TBI) in the U.S., where genomics is becoming an increasingly important part of diagnosis and treatment. Instead of the traditional one-size-fits-all approach, some researchers are examining not just imaging and clinical data but also patient proteome and genome data, and are discovering new linkages between TBI expression and an individual’s genetic makeup.
Third, Yamamoto described the need to foster system integration not only between science disciplines, but all along the biomedical continuum, and among all stakeholders.
Yamamoto stressed that the Alliance is foundational on all three of these planes and is well positioned to lead in turning the idea of precision medicine into practice.
David Altshuler (Broad Institute of MIT and Harvard), Chair of the Global Alliance Steering Committee, welcomed all attendees to the second plenary and stressed the importance of this moment for the Alliance. Altshuler reminded attendees that the field is in the midst of a data revolution, and the Alliance is addressing some of its greatest challenges: developing trusted routes for data sharing, effectively managing privacy issues, and respecting the diversity of views on how to drive efforts forward.
Altshuler stressed that the stakes are high, and that it will be up to the people in the room and those involved in the Alliance to make sure the field develops in the right way before we lose this window of opportunity. But he stressed that there is great potential to unleash a new era of innovation in health.
For over a year, the Global Alliance has been working to accelerate progress in human health by helping to establish a common framework of harmonized approaches to enable effective and responsible sharing of genomic and clinical data, and by catalyzing data sharing projects that drive and demonstrate the value of data sharing.
Altshuler discussed how the collaborative, iterative, and transparent model of the Alliance has a history in the technology space, but is newer to healthcare. Nevertheless, the progress of the Alliance has been substantial.
He noted that the Alliance itself has transitioned from a nascent coalition to a formal membership organization, with an established Constitution and composed of 141 exciting and unique organizational members in 23 countries to date. Altshuler also mentioned that diversity is vital to the Alliance’s current and continued progress, and one of the priorities for the coalition is to actively engage an even more diverse set of organizations and individuals in the coming months.
Before wrapping his presentation, Altshuler looked to the future, and how a widespread willingness to share data for the greater good is needed to realize the benefits of the work. He looked to the cultural shifts and incentives that will be needed to realize greater sharing, while respecting privacy and security of individuals.
Altshuler closed by describing the Global Alliance as poised to deliver larger-scale products in the coming months, driven by the community’s incredible passion and energy to advance human health.
Bartha Knoppers (McGill University) and Kazuto Kato (Osaka University), the Chair and Co-Chair of the Regulatory and Ethics Working Group (REWG), introduced the Working Group’s mission to address the ethical, legal, and social implications of enabling responsible genomic and clinical data sharing. This includes preparing the overall policy Framework for data sharing, and developing forward-looking governance policies pursuant to the Framework on consent, privacy and security, and accountability.
Knoppers emphasized that progress in the past year has been substantial: the Regulatory and Ethics Working Group launched six active Task Teams, fostered broad international engagement, entered into discussions with multiple international consortia, authored numerous publications, and created a major policy document, the Framework for Responsible Sharing of Genomic and Health-Related Data.
The Regulatory and Ethics Working Group Chair described the six active Task Teams launched in 2014 that allow for the benefits of genomic research while protecting individual privacy.
Currently, the REWG is establishing policy-specific Task Teams to draft content-specific policies guided by the Framework. This includes defining the key attributes of a data “safe haven” for secure and trusted genomic and clinical data sharing, a review of ethics review regimes, and developing a “points to consider” policy for establishing ethics review equivalency across institutions. Knoppers further announced that two new policy-specific Task Teams, one on Accountability and the other on Privacy and Security, are in the process of being formed, based on need and strong member interest.
Knoppers concluded her presentation by explaining that the REWG is poised to continue advancing responsible data sharing in three avenues:
Questions from the plenary attendees included which policies and international developments the co-chairs are most closely watching. In response, the REWG Co-Chairs described concerns and policy changes regarding the protection of personal data, not only in the European Union, but in the United States, and around the world. Co-Chairs invited members of the Global Alliance to channel information on personal data protection policies and news to the REWG to ensure real-time engagement and response in this dynamic area.
REWG next steps:
Paul Flicek (EMBL – European Bioinformatics Institute) and Dixie Baker (Genetic Alliance), Co-Chairs of the Security Working Group (SWG), described the need to lay a foundation for securing a data sharing commons, and to create a technology environment that provides individuals, researchers, clinicians, and other stakeholders assurance that data made available are shared, annotated, and interpreted only by appropriately authorized persons and entities in accordance with Global Alliance security and privacy policies. The Co-Chairs described their goal to help create a technology environment that provides such assurances, and the Global Alliance as a key enabler for this type of productive security ecosystem to emerge.
The major product being developed by the SWG is the Global Alliance Security Infrastructure, Version 1.0. The objectives of the Security Infrastructure are to manage five types of genomic data security risks: 1) unauthorized disclosure of stakeholder data, 2) unauthorized access or use of stakeholder data, 3) corruption or destruction of data, 4) disruption or degradation of services supporting availability and access to data, and 5) inappropriate actions that result in security incidents and diminish participation in the Global Alliance ecosystem.
The Security Infrastructure provides recommendations for security technology infrastructure to support Global Alliance privacy and security policies. Baker announced that Version 1.0 of the document has been published on the Global Alliance website. The Infrastructure is a “living document,” and as such, has been published with a link to enable submission of comments.
Looking to the future, the Co-Chairs described four primary areas of focus for the SWG. The first is establishing a collaborative effort for security incident reporting. Second, the SWG plans to develop standards for consent management. Third, the Working Group looks to test the proposed Security Infrastructure against existing and emerging sharing activities, particularly drawing on the expertise and insights of other working groups. Finally, the group will provide cross-cutting advice to other Alliance Working Groups and initiatives to ensure harmonized approaches to responsible data sharing. Three new SWG Task Teams were also proposed: Incident Response, Software Security, and Cloud Security.
In response to a question about the amount and type of data that can be released before a privacy breach, Flicek discussed reframing the concern from creating uniqueness from a data breach to doing harm. Whereas creating uniqueness is a technical concern, the question of when a data breach crosses over into harm is a policy one that should be further discussed. When asked a question on incident handling, the SWG Co-Chairs described the rapidly changing area of application security and, from a technical perspective, that such measures must be embedded in any data sharing approach.
SWG next steps:
David Haussler (University of California, Santa Cruz) and Richard Durbin (Wellcome Trust Sanger Institute) introduced the Data Working Group (DWG), describing the work of the group to overcome siloed data in incompatible systems and break down barriers to collaboration.
A global, accessible, and collaborative way of working defines the broader mode of operation of the DWG, which utilizes an open source software development environment at https://github.com/ga4gh. All individuals are welcome to participate and decisions are made nimbly by those most active in the open source development process.
In the past year, the DWG has launched five task teams to overcome key barriers to the technical sharing of genomic data:
The Co-Chairs also announced the Working Group will incubate four new task teams, building on the success of current efforts:
The Co-Chairs further described an exciting idea that was discussed at their group workshop the preceding day: Globally Unique Content-based Identifiers or Digests. Haussler raised the idea of producing concrete technology so that any genome dataset in the world can have an abstract identifier that is 1) unique, 2) privacy preserving, 3) not centrally assigned, 4) independent of the computational representation of the data, and 5) unforgeably linked to the content of that dataset. This identifier system, based on a cryptographic hashing method, would be vital for verification, de-duplication, and auditable tracking of reproducible analysis and inference. The ideas extend to other data types beyond genome sequence data.
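The digest idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that a dataset can be reduced to a set of variant records and that SHA-256 serves as the cryptographic hash. The canonical form, the function name `content_digest`, and the example records are hypothetical and do not come from the DWG. The sketch covers only properties 3 to 5 above and does not address privacy preservation.

```python
import hashlib

def content_digest(variants):
    """Return a hex digest that depends only on the dataset's content.

    `variants` is an iterable of (chromosome, position, ref, alt) tuples.
    This canonical form is a hypothetical choice for illustration.
    """
    # Canonicalize: normalize each record to text, drop duplicates, and sort,
    # so input order, letter case, and container type do not affect the result.
    canonical = sorted(
        {f"{c}:{int(p)}:{r.upper()}:{a.upper()}" for c, p, r, a in variants}
    )
    hasher = hashlib.sha256()
    for record in canonical:
        hasher.update(record.encode("utf-8"))
        hasher.update(b"\n")  # separator keeps record boundaries unambiguous
    return hasher.hexdigest()

# Two representations of the same (arbitrary, made-up) content.
as_list = [("17", 1000, "G", "A"), ("13", 2000, "T", "C")]
as_tuple = (("13", 2000, "t", "c"), ("17", 1000, "g", "a"))

print(content_digest(as_list) == content_digest(as_tuple))
```

Because the digest is computed locally from the content, no central registry has to assign it, and any change to the records yields a different identifier.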
When asked about the DWG’s engagement with research funders, the Co-Chairs responded that the group is engaging additional funders in several countries who are eager to participate in the Group’s activities. David Altshuler added that the Alliance Secretariat is working to secure funding to support these advancements, and asked the Alliance community to put the group in touch with additional funding agencies.
When Alliance members at the plenary noted that the rare disease community is also considering the development of globally unique identifiers, the discussion turned to the distinction between identifiers for individuals and identifiers for datasets (the digest concept is the latter). It was proposed to review the cross-Working Group implications of these efforts and report on this issue in the coming months.
In response to a final question of how available the tools of the DWG are to the broader global community, DWG Co-Chairs explained that these tools are available to anyone who can access the Internet, but acknowledged that development of tools must be combined with education and outreach to increase contributions and uptake. Once the reference implementation is complete, Haussler stated that the DWG will be in a stronger position to reach out actively and broadly to potential users.
DWG next steps:
Kathryn North (Murdoch Childrens Research Institute) began by noting that the Clinical Working Group (CWG) is driven by one question: how do we represent phenotypic data and link it to genotypic information? The goal of the CWG is to address both the research and clinical use of genomic data, utilizing an approach that is physician-oriented, researcher-focused, and patient-centered.
North described the major activities of the CWG, including the current mapping of initiatives that promote data sharing. In addition to working closely with the BRCA Challenge and the Matchmaker Exchange, the CWG hosts the following four task teams.
In response to a question about the next steps for linking the CWG’s cutting-edge genomics work to electronic health records, North responded that a likely path would be selecting a few demonstration projects. This sparked a lively discussion about the need to involve major private electronic health records providers (several of which had already been invited to participate). Alliance leadership agreed that while the CWG will remain technology agnostic, the participation of the private sector is of critical importance.
A second area of discussion centered on a question of integrating non-human model organisms into the work of the CWG. North agreed to explore this important idea, which was also raised in the context of the DWG’s Metadata Task Team.
Plenary attendees also raised questions about better integrating disease into the CWG’s approach. In response to a question about the place of infectious disease in the CWG, the Alliance agreed to convene a group to explore the idea of an Infectious Disease Working Group or Task Team in the next months. Those interested in exploring this idea were invited to volunteer or suggest others who should be involved.
Finally, in response to a question about how the CWG plans to engage the patient community into the future, North looked to explore ways to encourage the patient community to be more active in responsible sequencing and data sharing.
CWG next steps:
Sir John Burn (Newcastle University) and Stephen Chanock (National Cancer Institute), BRCA Challenge Steering Committee Co-Chairs, introduced the newly launched BRCA Challenge, the mission of which is to translate the rapid expansion of sequencing capacity into useful knowledge and, in particular, learn how to rapidly interpret variant data to generate clinical utility.
The Steering Committee Co-Chairs described the BRCA Challenge as a vanguard effort to aggregate BRCA1 and BRCA2 data in order to understand variation and its impact on human health, while demonstrating how to approach large datasets for other disease areas of study. Since its recent formation, the BRCA Challenge has formed a Steering Committee and is formalizing strategic goals, a structure, key deliverables, and an aggressive timeline for progress.
The BRCA Challenge announced that its first step will be ensuring that three major datasets (ClinVar, LOVD, and UMD) are queryable. The group then plans to expand to include other datasets (such as ENIGMA and CIMBA) and to seek out additional sources of data to add to the Challenge.
Burn and Chanock proposed releasing the three deliverables:
The Co-Chairs concluded that advancing the BRCA Challenge, an effort that has captured the imagination of many in the room and around the world, in the longer term requires sustaining infrastructure, retaining expert leadership, and providing opportunity for grants and contracts to advance knowledge.
BRCA Challenge next steps:
Heidi Rehm (Harvard Medical School) discussed Matchmaker Exchange, one of the key projects that the Global Alliance is working to accelerate. Matchmaker Exchange is already working to bring together cases with overlapping phenotype and candidate genes, both to gain research insights and to return information to patients.
Rehm described how, conceptually, matchmaking occurs when a submitter queries all API-linked databases with gene candidates, disease name, and the submitter’s. If a match occurs, the depositor and the requestor are both notified and details of the case are shared manually to begin follow-up studies to validate the match.
Rehm announced two major developments with the project. The first is the development of API Version 1.0, which is already forming an interface between several distinct genomic databases. The second is the creation and launch of a new website, www.matchmakerexchange.org, which will form the basis upon which the group communicates with the broader community of interest.
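The matching step described above might be sketched as follows. This is an illustration under stated assumptions, not the Matchmaker Exchange API: the `Case` type, the `find_matches` function, the in-memory databases, and the rule that a match needs at least one shared candidate gene and one shared phenotype term are all hypothetical. Gene and phenotype labels are placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Case:
    case_id: str
    contact: str           # who to notify if this case is matched
    genes: frozenset       # candidate gene labels
    phenotypes: frozenset  # phenotype terms

def find_matches(query, databases):
    """Return (database name, case) pairs overlapping the query.

    Assumed rule: at least one shared candidate gene AND one shared phenotype.
    """
    matches = []
    for name, cases in databases.items():
        for case in cases:
            if query.genes & case.genes and query.phenotypes & case.phenotypes:
                matches.append((name, case))
    return matches

# Hypothetical linked databases holding previously deposited cases.
databases = {
    "db_one": [Case("c1", "depositor1@example.org",
                    frozenset({"GENE_A", "GENE_B"}), frozenset({"ataxia"}))],
    "db_two": [Case("c2", "depositor2@example.org",
                    frozenset({"GENE_C"}), frozenset({"ataxia"}))],
}

query = Case("q1", "requestor@example.org",
             frozenset({"GENE_A"}), frozenset({"ataxia", "seizures"}))

for db_name, case in find_matches(query, databases):
    # Both parties are told about the match; case details are exchanged manually.
    print(f"notify {case.contact} and {query.contact}: match in {db_name}")
```

Only the fact of a match is reported here, which mirrors the description that details are shared manually afterwards.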
She noted that these accomplishments are being accelerated through close collaboration with each of the Alliance Working Groups: the DWG providing support for the API development, the REWG providing feedback on consent protocols, the SWG providing guidance on query authentication, and the CWG sharing expertise with phenotyping for matchmaking.
To build on this initial progress, the Matchmaker Exchange project plans to finalize the API in conjunction with the GA4GH Data Working Group and to develop guidance for groups without a database wishing to choose a site for data deposition and matchmaking support.
Matchmaker next steps:
Steve Sherry (National Center for Biotechnology Information) presented the accomplishments of the Beacon Project, a project first proposed at the Alliance’s March 2014 plenary meeting as a way to demonstrate the leadership, institutional permission, and technical ability to share data. Beacons also show where data is present, and where gaps in mapping may need to be addressed.
Sherry noted that since the initial proposal in March 2014, fifteen beacons are now active around the world, with more to come. The beacons themselves are an API query that indicates if a simple allele is present in an affiliated institution’s data repository, and the group announced it will soon release a Beacon of Beacons, in which a single beacon query searches each database in a federation of repositories. Sherry issued a call to the Alliance community to join in the project to link additional dozens and hundreds of data repositories to the effort. The Beacon Project is also collaborating with data repositories to finely tune beacon queries to adhere to individual repository concerns about data sensitivity or consent requirements.
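A beacon and a Beacon of Beacons can be pictured with a small in-memory sketch. It is a hedged illustration only: the `Beacon` class, the `beacon_of_beacons` function, the repository names, and the allele tuples are hypothetical, and a real beacon would answer over a web API rather than a local method call.

```python
class Beacon:
    """Answers one yes/no question about a single repository (hypothetical)."""

    def __init__(self, name, alleles):
        self.name = name
        # Each allele is a (chromosome, position, base) tuple; made-up data.
        self._alleles = set(alleles)

    def query(self, chromosome, position, base):
        # Only a boolean leaves the repository, never the underlying records.
        return (chromosome, position, base) in self._alleles

def beacon_of_beacons(beacons, chromosome, position, base):
    """Send the same query to every beacon in the federation."""
    return {b.name: b.query(chromosome, position, base) for b in beacons}

federation = [
    Beacon("repository_one", [("1", 100, "A"), ("2", 200, "T")]),
    Beacon("repository_two", [("1", 100, "G")]),
]

answers = beacon_of_beacons(federation, "1", 100, "A")
print(answers)
```

Returning a per-repository answer keeps each institution in control of its own response, which leaves room for the repository-specific tuning of queries mentioned above.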
Overall, the Beacon Project has established three tiers of beacon queries, based on the strength of an affiliation between users and data repositories. The most basic relationship is with an anonymous user who can gain access to genomic information that can be provided without privacy concerns. Sherry mentioned that users who provide a verifiable name and institutional affiliation may access additional genomic information that may be scientifically useful, and in the final case, a user may enter into a binding agreement for access to one or more specific datasets, including the commitment to abide by clear and robust privacy standards, and gain full access to the relevant data.
Plenary attendees raised several questions about protections for data and the possibility of donor identifiability in the case of forensic or law enforcement queries and other scenarios. Currently, beacons only point users to aggregated information from a group of studies, and not to individual level data. However, Alliance leaders agreed that even providing aggregate information is delicate, requiring guidelines for what types of aggregated data can be released, its implications, and the risks of identification.
Beacon next steps:
The final session during the plenary meeting was a panel discussion moderated by John Mattison (Kaiser Permanente) and featuring panelists David Glazer (Google), Brad Margus (A-T Children’s Project), Johan den Dunnen (Leiden University), Cindy Bell (Genome Canada), and Lana Skirboll (Sanofi). These diverse panelists were selected because of their different vantage points throughout the world of genomics and their alternative perspectives on big, new opportunities to advance the Global Alliance.
John Mattison, as moderator, led off the discussion noting that while genomic complexity is vast, putting datasets together as the previous presentations described allows us to understand incredibly complex interactions in a more meaningful way. Mattison noted that privacy concerns are still of critical importance to this field, but over time, social norms may evolve and shift the value proposition in favor of increased sharing.
Lana Skirboll, who spent her career at NIH before entering the private sector at Sanofi, raised a number of key issues for the field. She described industry as being in the middle of a transition, increasingly recognizing the importance of data sharing and open innovation through increased engagement with public-private partnerships, working with academics, and others. Skirboll mentioned that governments are also transitioning: as they begin to realize that data are an economic asset, some are restricting the movement of data across borders. These are political developments to watch and to engage with in the context of trade treaties and other means.
Skirboll also raised the issue of law enforcement access to data, mentioned in an earlier presentation. She said she believes such access is inevitable and suggested that this community focus on identifying the risks and defining informed consent policies. Another emerging area of data protection concerns the data produced by the exploding field of real-time, personal sensors. Finally, in this shifting world, Skirboll recommended that the Global Alliance consider working more closely with regulators to ensure approaches are informed and aligned.
David Glazer described what he called a “thought-experiment” to illustrate key areas of focus for the Global Alliance. Glazer asked: If the genomics and health community wanted to put together a database of 100,000 genomes that was useful and available in a matter of months, what he called a Tree of Life for researchers, what would be required?
Glazer noted that the first area to tackle would be the technical sharing of data, and this would likely require a federated project to connect data repositories and quickly get to scale. Data hosts would form the branches of the structure, filled in with sets of data provided by many data collectors and contributors. Contributors could include highly mobilized individuals and groups, like Autism Speaks or Brad Margus and the A-T Children’s Project, whose energy and resources would be vital. But Glazer emphasized that this is already technically doable.
What is more challenging, Glazer argued, is the policy, access, and consent rules that govern the sharing and use of this information. Glazer concluded that in his opinion, the best way forward on the policy side is to develop a portable consent process where data donors can choose to make their data available to all qualified researchers, without the researchers needing to apply for access to every “branch of the tree” individually.
Brad Margus shared his family’s own struggle with disease, as two of Margus’s young sons were diagnosed with ataxia-telangiectasia (A-T), a rare and devastatingly degenerative disease. Margus has since founded the A-T Children’s Project and devoted himself to coordinating and seeking funds for research on A-T. Recent advances in data sharing helped Margus to identify individuals whose genomic information may hold a key for understanding more about this disease. Furthermore, Margus has begun raising funds to perform additional sequencing, another avenue that has only been available in recent years. From the perspective of a leading disease advocate, Margus reiterated the importance of standardized, flexible consent forms to aid in the responsible collection and use of genomic data to fight disease.
Cindy Bell described her perspective as a funder in the field of genomics and health. Echoing Keith Yamamoto’s opening remarks, Bell described the genomics and health community as being at an inflection point, where additional impact must be demonstrated. Bell praised the incredible energy of Global Alliance contributors, recognizing the voluntary efforts that have driven much of the coalition’s progress to date.
Yet Bell acknowledged that volunteerism, while a powerful demonstration of commitment, is not sustainable in the long run and that targeted investments are vital. Bell described her organization’s launch of a pilot project to support the objectives of the Global Alliance in Canada and encouraged similar investment. Remarking on how the field is changing, Bell concluded that costs have shifted in recent years from sequencing to data management and infrastructure, and noted that funders are increasingly supportive of public-private partnerships to achieve impact.
Johan den Dunnen, the final panelist to offer remarks, spoke from the perspective of database management, as the founder of the Leiden Open Variation Database (LOVD). He stressed that DNA diagnostics, and DNA knowledge more broadly, depend on sharing information on genes, variants and phenotypes. Years ago, surprised and frustrated by the lack of data sharing, den Dunnen embarked on a remarkable personal project to create what is now LOVD, an open source database of DNA variants. Den Dunnen described a culture that is shifting but still largely has not overcome resistance to uploading data or the difficulty of attracting funding for this essential work. He suggested making data sharing obligatory by law, with associated payment to host and curate the data.
After these brief panel remarks, a question from the audience about how the Global Alliance might successfully handle instances of data misuse and regulate the field sparked a lively discussion about responsibilities and priorities. Alliance Steering Committee Members, including Altshuler and Knoppers, agreed that this is an issue of great significance for the future of the Alliance, and that determining how best to handle it as a group would be a future focus of the effort.
Martin Bobrow (University of Cambridge, Emeritus) delivered closing remarks at the plenary meeting, summarizing the productivity of the coalition and the energetic mood of the meeting. Bobrow described the Alliance as both a philosophical discussion and a social movement for responsible data sharing, connected to an even broader network of researchers, clinicians, patient advocates, and more.
Ultimately, Bobrow said, the Alliance will succeed or fail not on its philosophy, but rather on its ability to produce transformative ideas and products that are both high quality and highly relevant. To have this impact, the challenge for the Alliance is choosing the most impactful areas of genomics and data sharing to focus on in the coming months, amidst so many promising avenues. Many new and exciting ideas were shared at the plenary meeting today, and members of the Alliance will be taking up a number of them.
Bobrow pointed out that one of the successes of the Global Alliance is that so much has been achieved to date through an essentially voluntary effort, proof that this work is highly important and interesting to this field. To have transformative impact over time, however, this work must be sustained by funders.
Bobrow concluded by thanking the attendees, participants, and Alliance leadership, and invited all to the next plenary meeting in 2015.
The Clinical Working Group continues to evolve. The Catalogue of Activities for Mendelian Genetic Disorders represents a major deliverable in 2014, on which several subsequent Work Products will be modelled. The following Task Teams reported on their work and next steps:
Phenotype Ontology Task Team
Clinical Cancer Genome Task Team
eHealth Task Team
Many tasks were identified and assigned to move forward including:
Future Goals and Points of Agreement
The CWG is looking to engage the broader interested community and solicited advice and feedback in San Diego. Recommendations raised there include: creating a website similar to the DWG Github to showcase current work and more easily allow others to participate; formalizing a communication strategy (e.g. regular newsletters, greater engagement with other parts of the world, etc.); formalizing cross-communication among Working Groups; convening meetings in other parts of the world; and generating greater engagement with clinicians and scientists. These comments reflected both an awareness on the part of the CWG and a desire on the part of the interested community for the CWG to be more transparent and more accessible.
Potential next steps that were discussed included:
Before deciding next steps in these areas, the CWG will investigate what efforts currently exist and then determine whether to create a new deliverable/project/task team.
In 2014, the Data Working Group released the GA4GH API version 0.5, and the API continues to improve and evolve. There is broad enthusiasm for moving both existing and new initiatives forward. The following Task and Project Teams reported on their work and next steps:
Reads and Reference Variation Task Team
Benchmarking Task Team
File Formats Task Team
Metadata Task Team
RNA and Gene Expression Task Team (NEW)
Genome Annotation Task Team (NEW)
Genotype2Phenotype Association Task Team (NEW)
Containers and Workflows Task Team (NEW)
Future Goals and Points of Agreement
The concept of globally unique content-based identifiers or digests was defined. Any version of any genome sequence dataset (or other large dataset) in the world can have an abstract identifier that is: 1. unique for that dataset version (no “copy” of that sequence+metadata at any other location at any time in the future will ever have a different identifier, and no two different versions will ever “collide” by accidentally getting the same identifier)
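As a minimal sketch of how such a content-based identifier could be derived, the example below hashes a sequence together with its metadata. The choice of SHA-256, the canonical JSON encoding of the metadata, and the function name `content_digest` are all assumptions made for illustration; the report does not specify an algorithm. Note also that a cryptographic hash makes collisions between different versions extremely improbable rather than strictly impossible.

```python
import hashlib
import json


def content_digest(sequence: str, metadata: dict) -> str:
    """Return a hex digest computed only from the content itself.

    Assumption: SHA-256 over the sequence bytes plus canonical JSON metadata.
    Because nothing location-specific is hashed, any copy of the same
    sequence+metadata yields the same identifier wherever it is stored.
    """
    # Sort keys and fix separators so equal metadata always serializes identically.
    canonical_meta = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    h = hashlib.sha256()
    h.update(sequence.encode("ascii"))
    h.update(b"\x00")  # separator so sequence and metadata bytes cannot run together
    h.update(canonical_meta.encode("utf-8"))
    return h.hexdigest()


# Hypothetical example data, not taken from any real dataset.
original = content_digest("ACGT", {"assembly": "example", "version": 1})
copy_elsewhere = content_digest("ACGT", {"version": 1, "assembly": "example"})
new_version = content_digest("ACGA", {"assembly": "example", "version": 1})

print(original == copy_elsewhere)  # same content, same identifier
print(original == new_version)     # different content, different identifier
```

Canonicalizing the metadata before hashing is what lets two independently stored copies agree on the identifier even when their key order differs.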
In 2014, the Regulatory and Ethics Working Group published the Framework for Responsible Sharing of Genomic and Health-Related Data. It is referenced in the Global Alliance Constitution. The Working Group also published Consent Tools (see below).
Framework Task Team
Data Safe Havens Task Team
Consent Task Team
Ethics Safe Harbor Task Team
Data Protection Regulation Task Team
Privacy and Security Policy Task Team
Future Goals and Points of Agreement
The Consent Task Team is also currently discussing with GSK representatives and Harvard MRCT participants how the Consent Tools can inform consents for clinical trials and industry-standard consent forms. Additionally, the Consent Task Team is looking at the concept of machine-readable consents (noting that HL7 is working on a machine-readable consent directive, as is the World Economic Forum).
In the future, the Data Protection Regulation Task Team plans to explore genomic cloud computing and data protection issues.
Noting that a paper has been written by REWG Executive Committee member Paul Burton and colleagues (currently undergoing peer review) that explores the topic of “data safe havens”, it was agreed that a “data safe haven” is a place where data can be stored and accessed by all types of groups, that can be trusted by all parties, and that hosts genomic and clinical data, in both open and controlled formats.
On the subject of identifiers, there is some confusion about what the various types of identifiers are in concept and practice (e.g. UUID, GUID, ORCID), including from a regulatory and ethical perspective. Therefore, the REWG supports the idea of working with the other Working Groups to develop a glossary or short document that explains what each of these terms means in concept and practice, and how they may impact on the work of the Global Alliance. Not only would this document better frame the discussions currently being had within projects, Working Groups, Task Teams, and Member Organizations, but it would also aid the genomic and clinical data sharing community more broadly. Indeed, as identifiers (especially universal identifiers) are usually assigned by some agency or consortium with governmental or international oversight, this may be a space where the Alliance wants to set leading standards for identifiers for genomic and health-related data. To do so, it is necessary to speak the same language and understand what these various terms mean.
Future ideas for possible development include:
The major product of the group has been the Security Infrastructure document (to be released as v1.0, with an open link for comments). Other Task Teams are beginning to identify needs for SWG input.
The Working Group decided to establish three new Task Teams:
Future Goals and Points of Agreement
The Global Alliance currently cannot accept responsibility for security oversight (corresponding to Control Objectives 4 and 5 in the Security Infrastructure). Two models were proposed to address this:
Processes for responding to and mitigating security breaches need to be developed, including:
A technology solution for consent management also needs to be developed. Toward this objective, Eve Maler, Chair of the Kantara User Managed Access (UMA) Work Group, presented ongoing work to define a profile of the OAuth 2.0 authorization standard (IETF RFC 6749) that enables individuals to authorize access to resources that they own.
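For orientation, the sketch below builds a plain OAuth 2.0 authorization request as described in RFC 6749, section 4.1.1. It illustrates only the base standard, not the UMA profile that was presented. The endpoint URL, client identifier, redirect URI, scope name, and the helper `build_authorization_url` are hypothetical values chosen for the example.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope):
    """Build an RFC 6749 authorization-code request URL and its state value."""
    # The state value lets the client match the response to this request (CSRF defence).
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",  # request an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state


# Hypothetical values for illustration only.
url, state = build_authorization_url(
    "https://auth.example.org/authorize",
    client_id="research-portal",
    redirect_uri="https://portal.example.org/callback",
    scope="genomic-data.read",
)
query = parse_qs(urlparse(url).query)
print(query["response_type"], query["scope"])
```

The resource owner would be sent to this URL to approve or deny the request; exchanging the resulting code for a token is a separate step not shown here.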
Going forward, the SWG will launch three new Task Teams focusing on Software Security, Incident Response, and Cloud Security. In addition, interaction with the Task Teams of the other GA4GH Working Groups will increase. The interaction will be achieved in several ways:
In the near future, form the three new Task Teams, drawing from the SWG Interest Group as appropriate.
The four current Working Groups are catalyzing key collaborative projects that aim to share real world data. The Project Teams move their work forward autonomously, with varying levels of coordination support and oversight from the Global Alliance, and drawing on expertise from the Working Groups as required. It is a considerable achievement that these projects have been initialized and are moving forward through uncharted territory as we continue to define relationships and support mechanisms between key projects and the Global Alliance.
Working Group Coordinators will play a connecting role in ensuring that key project needs are addressed in a timely manner.