22 June 2020
On Wednesday, June 17, the GA4GH Data Use and Researcher Identity (DURI) Work Stream hosted the webinar “GA4GH Passports: Benefits of Integrating a Global Electronic ID for Accessing Biomedical Data.” More than 80 individuals tuned in to learn about GA4GH Passports, a standard for digitally representing researcher identities and permissions that promotes consistency and scalability in the data access process. Craig Voisin of Google and Mikael Linden of ELIXIR, heads of the Passports development team, shared current data access challenges, use cases, and the ways in which the Passports standard addresses those challenges. Presentation slides and a recording of the webinar are available online.
Voisin began the webinar with an overview of the types of access that govern datasets today. Public access datasets are open to the world; anyone can discover or access them. At the other end of the spectrum is controlled access, where datasets are under the strictest controls. Registered access lies between the two and offers a middle-ground approach in which individuals go through a vetting process, such as agreeing to ethics terms, to discover and access datasets.
Voisin and Linden discussed a series of challenges that currently exist in the data access process. These include:
To address these challenges, GA4GH Passports allow researchers to transport their identity and access qualifications between organizations and computing environments. The standard was built to be interoperable: different data controllers and hosts can understand the same “language” to authorize and enable access. GA4GH Passports aim to unify identities across systems and organizations, establish verifiable permissions, and streamline existing processes while improving security and regulatory compliance oversight.
Drilling down further, Voisin described how the Passport is composed of “visas,” which denote a researcher’s qualifications and permissions. Visas are securely collected from sources of truth, are verifiable, and are time-limited. Some visas may also represent group or role membership: for example, what members of a certain institution or organization are allowed to do with a dataset.
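The passport-and-visa structure described above can be sketched as a small data model. This is a simplified illustration, not the GA4GH specification: the class names, field names, identifiers, and timestamps are assumptions chosen for readability, and a real visa would also be cryptographically signed so that it can be verified.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Visa:
    """One qualification or permission (hypothetical, simplified fields)."""
    type: str      # kind of assertion, e.g. an affiliation or an access grant
    value: str     # what is asserted, e.g. a role or a dataset identifier
    source: str    # the "source of truth" that made the assertion
    asserted: int  # Unix time at which the assertion was made
    expires: int   # Unix time after which the visa is no longer valid

    def is_active(self, now: int) -> bool:
        # Visas are time-limited: valid from `asserted` up to, not including, `expires`.
        return self.asserted <= now < self.expires


@dataclass(frozen=True)
class Passport:
    """A researcher identity plus the visas collected for it."""
    subject: str
    visas: List[Visa]

    def active_visas(self, now: int, visa_type: Optional[str] = None) -> List[Visa]:
        # Keep only unexpired visas, optionally filtered by type.
        return [
            v for v in self.visas
            if v.is_active(now) and (visa_type is None or v.type == visa_type)
        ]


# Illustrative data only; the identifiers and timestamps are made up.
passport = Passport(
    subject="researcher-123",
    visas=[
        Visa("AffiliationAndRole", "faculty@example.org",
             "https://idp.example.org", asserted=1_000, expires=5_000),
        Visa("ControlledAccessGrants", "https://data.example.org/dataset/42",
             "https://dac.example.org", asserted=1_000, expires=2_000),
    ],
)
```

Calling `passport.active_visas(now=3_000)` in this sketch would return only the affiliation visa, because the access grant has already expired by then.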
Next, Voisin and Linden dove into the three phases of the data access process—Discovery, the Data Access Compliance Office (DACO) process, and Data Access and Use—and specific challenges and opportunities with each step.
During the Discovery Phase, researchers may face challenges discovering potential datasets for their studies or determining whether a dataset is worth applying for. Linden shared a potential lightweight solution using registered access Beacons, built upon the GA4GH Beacon specification. Once authenticated and authorized via their GA4GH Passport, researchers can query a Beacon Network to learn more about a dataset.
In the DACO phase, researchers go through an approval process to gain access to the data. Information on the researcher and their particular study flows to the Data Access Committee (DAC), which evaluates the request. While much of the DACO process is manual, Voisin shared how Passports can help verify a researcher’s role and affiliation and securely transfer data access agreement sign-offs. This would allow DAC members to focus on whether the research is fit for use and ethically sound, reducing the burden of verifying inputs and approvals manually.
Once researchers are approved, they enter the Data Access and Use Phase. Voisin and Linden described two approaches to how researchers can now work with this data:
Voisin shared many benefits to bringing compute to data, including security improvements, overcoming jurisdictional challenges, cost and time savings, and the ability to use advanced tools at scale.
“By having more computing power and more datasets available globally, these advanced tools have the ability to offer higher quality analysis—enabling researchers to advance our understanding of human health and disease,” said Voisin.
Linden then described an example of bringing compute to data within ELIXIR. An authorization mechanism known as ELIXIR Authentication and Authorization Infrastructure (AAI) serves as the “passport broker” to provide information on datasets within the European Genome-Phenome Archive. Once a researcher has found and gained approval for a dataset, they can use their passport to access the cloud environment and run their analyses on the data.
Linden noted, “If you design a system that brings compute to the data, identity becomes the key—it’s the glue that holds together the home organization and affiliation of the researcher with the appropriate data access agreement and controlled access grant so that the access control can be enforced properly in the computing environment.”
Bringing all the pieces together, Voisin and Linden discussed how GA4GH Passports can be integrated into current data access ecosystems. Integration steps include wrapping existing permissions and attestations into the Passport visa format, adding a Passport Broker mechanism that can securely transfer visa information, and layering in a Passport control service to check access. The goal is data unification while leveraging existing infrastructure: a seamless way to access data across environments through a unified passport.
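The three integration steps above can be sketched end to end. This is a minimal illustration under stated assumptions: the function names, the shape of the legacy permission record, and the example URLs are all hypothetical, and the signing and verification that a real broker and control service would perform are omitted.

```python
from typing import Dict, List, Set


def wrap_permission(record: Dict, source: str) -> Dict:
    """Step 1: wrap an existing permission record in a visa-shaped dict.

    The legacy record layout (user, dataset, granted_at, ttl_seconds)
    is an assumption made for this example.
    """
    return {
        "type": "ControlledAccessGrants",
        "value": record["dataset"],
        "source": source,
        "asserted": record["granted_at"],
        "expires": record["granted_at"] + record["ttl_seconds"],
    }


def issue_passport(subject: str, records: List[Dict], source: str) -> Dict:
    """Step 2: a broker gathers the subject's visas into one passport."""
    visas = [wrap_permission(r, source) for r in records if r["user"] == subject]
    return {"sub": subject, "visas": visas}


def check_access(passport: Dict, dataset: str, now: int,
                 trusted_sources: Set[str]) -> bool:
    """Step 3: a control service checks for a current grant from a trusted source."""
    return any(
        v["type"] == "ControlledAccessGrants"
        and v["value"] == dataset
        and v["source"] in trusted_sources
        and v["asserted"] <= now < v["expires"]
        for v in passport["visas"]
    )


# Illustrative legacy permissions held by a hypothetical data access committee.
legacy_records = [
    {"user": "alice", "dataset": "dataset-42", "granted_at": 100, "ttl_seconds": 900},
    {"user": "bob", "dataset": "dataset-42", "granted_at": 100, "ttl_seconds": 900},
]
dac = "https://dac.example.org"
alice_passport = issue_passport("alice", legacy_records, source=dac)
```

In this sketch, `check_access(alice_passport, "dataset-42", now=500, trusted_sources={dac})` succeeds, while the same check fails after the grant expires or when the control service does not trust the issuing source. Keeping the existing permission store untouched and only wrapping its records mirrors the stated goal of leveraging existing infrastructure.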
Lastly, the team presented current considerations and future directions for the GA4GH Passports standard, requesting feedback from the community. Melissa Konopko, who manages the DURI Work Stream, invited viewers to get involved with the development and implementation of Passports by accessing the GA4GH Passports Landing Page or reaching out directly.
To learn more about GA4GH Passports, view Part 2 of this webinar series—Implementing GA4GH Passports and AAI: Technical Deep Dive—where the team will delve into the technical details of implementing GA4GH Passports alongside the GA4GH AAI Specification.