29 Apr 2022
The latest GDPR Brief, written by Melissa Cline, addresses how a well-designed federated analysis mechanism can enable responsible data sharing that complies with the GDPR.
The General Data Protection Regulation (GDPR) places a number of restrictions on how organizations both within and outside of the European Union (E.U.) may process (i.e. collect, use, and share) personal data, which is defined as data that relates to “an identified or identifiable person”. While these restrictions present obstacles to sharing genomic and health data, federated analysis can offer a solution. Traditional data sharing involves data providers sending a copy of their data to data recipients, who analyze the data at their home institutions (“bringing the data to the code”). Federated analysis, conversely, involves “bringing the code to the data”: requesting parties submit a copy of their analysis software to the institution hosting the data, and the data are not shared beyond that host institution. Federated analysis shares the aggregate, group-level results of data analysis amongst collaborating institutions, without revealing the individual-level personal data used to perform this analysis. Therefore, federated data analysis enables research institutions to engage in collaborative data analysis without exchanging personal biomedical data, which may facilitate GDPR compliance. For instance, it could in some cases reduce the number of participants in data analysis that the law considers to be joint data controllers.
Two examples of federated networks are the Beacon network and the Matchmaker Exchange, in which a number of different organizations host services that allow specific queries of their data, enabling the discovery of cases presenting a rare variant or symptoms suggesting a rare disease.
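To make the idea of a discovery query concrete, the following is a minimal sketch of a presence-only query across several hosting organizations. It is not the Beacon or Matchmaker Exchange API: the `Site` class, the `query_network` function, and the variant identifier format are all hypothetical names chosen for illustration. The point it shows is that each host answers only "yes" or "no" and never returns individual-level records.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical hosting organization holding variant data locally."""
    name: str
    # Variant identifiers held at this site (illustrative format, not a standard).
    variants: set = field(default_factory=set)

    def has_variant(self, variant: str) -> bool:
        # Only a boolean leaves the site; the underlying records stay put.
        return variant in self.variants


def query_network(sites, variant):
    """Ask every site the same yes/no question and collect the answers."""
    return {site.name: site.has_variant(variant) for site in sites}


# Simulated network with made-up variant identifiers.
sites = [
    Site("site_a", {"chr1:100:A>G", "chr2:200:C>T"}),
    Site("site_b", {"chr3:300:G>A"}),
]
responses = query_network(sites, "chr1:100:A>G")
print(responses)
```

A researcher who receives a positive answer would then contact the hosting organization through its own access process. Note that even a yes/no answer about a rare variant can carry re-identification risk, so a sketch like this does not by itself establish that the responses are anonymous.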
Often, one institution oversees data coordination by organizing and issuing data requests and by collating and disseminating the results. This approach is used in the CanDIG and CINECA networks. Within these networks, software containers are shared with the partner organizations, each of which applies them to its internal data within its secure institutional environment, generating anonymized data or contributing to its de-identification. The analysis results, harmonized to common technical standards, can then be shared across the network. The GA4GH encourages federation for the sharing of data that “cannot move for technical or legal reasons”.
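The coordinated pattern described above can be sketched as follows. This is an illustrative simulation in a single process, not the CanDIG or CINECA implementation: the function names `local_summary` and `combine`, the record layout, and the numbers are assumptions. In a real deployment, `local_summary` would run inside each partner's secure environment (for example, packaged in a software container), and only its return value would be sent to the coordinating institution.

```python
def local_summary(records):
    """Runs at the host site; returns group-level aggregates only."""
    values = [r["value"] for r in records]
    return {"n": len(values), "total": sum(values)}


def combine(summaries):
    """Runs at the coordinating institution on aggregates alone."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return {"n": n, "mean": total / n if n else None}


# Made-up measurements held separately by two hypothetical sites.
site_data = {
    "site_a": [{"value": 4.0}, {"value": 6.0}],
    "site_b": [{"value": 10.0}],
}

# Each site computes its own summary; individual records are never pooled.
summaries = [local_summary(records) for records in site_data.values()]
pooled = combine(summaries)
print(pooled)
```

Sending counts and sums rather than per-site means lets the coordinator compute the same pooled mean it would obtain from the combined data. A summary with a very small `n`, such as the single record at `site_b` here, could still reveal information about an individual.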
For aggregated data not to be regulated as personal data, there must be no means reasonably likely to be used to infer the identities of the underlying individuals from the aggregate results. This is not necessarily true for aggregate data about rare diseases or rare genetic variants, which might plausibly be observed in just one person. If data are organized according to demographic traits such as ethnicity or age bracket, it might be possible to infer the identities of the individuals concerned from a unique combination of demographic traits belonging to them. As such, aggregated data are not always anonymized. As with personal data, the privacy of aggregated or de-identified data is best evaluated with a contextual, risk-based approach. Within data science, the field of “statistical disclosure control” (SDC) offers an expansive literature and a breadth of methods for reducing the risk of disclosing personal data in data sharing, balanced against ensuring that the data to be shared remain informative.
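One simple SDC technique is small-cell suppression: withholding any group count that falls below a minimum size. The sketch below illustrates it under stated assumptions. The function name `suppressed_counts`, the record fields, and the threshold of 5 are illustrative choices, not values prescribed by the GDPR or by GA4GH.

```python
from collections import Counter


def suppressed_counts(records, keys, threshold=5):
    """Count records per demographic group, hiding groups below the threshold.

    A suppressed group is reported as None rather than as its true count.
    """
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {
        group: (n if n >= threshold else None)
        for group, n in counts.items()
    }


# Made-up records: six carriers in one age bracket, a single carrier in another.
records = [{"age_bracket": "40-49"}] * 6 + [{"age_bracket": "80-89"}]
table = suppressed_counts(records, keys=["age_bracket"], threshold=5)
print(table)
```

Here the lone individual in the "80-89" bracket is not reported, while the larger group is. Suppression alone does not guarantee anonymity: suppressed values can sometimes be recovered from totals or from overlapping queries, which is why a contextual assessment is still needed.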
Using federated analysis methods instead of disclosing identifiable personal data also provides other advantages in aligning research priorities with GDPR compliance. For example, using federated data analysis methodologies that limit the processing of personal data can facilitate compliance with the Data Minimization principle established in Article 5 of the GDPR.
A well-designed federated analysis strategy that leverages open-source software can promote the safety and integrity of the research process by enhancing the reproducibility and transparency of the results.
In summary, while federated analysis alone does not guarantee compliance with data protection law, a well-designed federated analysis mechanism can enable responsible data sharing that is in compliance with the GDPR.
Relevant GDPR Provisions
Melissa Cline is a Program Manager at the University of California Santa Cruz.
Please note that GDPR Briefs neither constitute nor should be relied upon as legal advice. Briefs represent a consensus position among Forum Members regarding the current understanding of the GDPR and its implications for genomic and health-related research. As such, they are no substitute for legal advice from a licensed practitioner in your jurisdiction.