Diversity in Datasets

Aims to develop a policy framework for promoting global diversity in the datasets used in genomic research

If we want all people to truly benefit from scientific advancement and the full potential of genomics, we need to use diverse datasets for research and clinical care. But at all stages of genomic research, we see a critical lack of diversity: in research participation and recruitment, in the genomic workforce, and in emerging techniques and approaches such as polygenic risk scores and machine learning. Thus, the GA4GH Regulatory & Ethics Work Stream (REWS) aims to define the concepts that underpin this topic (such as “diversity” and “representation”) and to develop actionable recommendations that researchers can use to uphold diversity in their research and findings.

  • Aims to develop guidance on how best to promote diverse datasets in genomic research
  • Aims to promote an international lens on meaningful diversity in datasets, a topic that is often confined to national discussions

Community resources

Diversity in the datasets used for research and clinical care is key to realising the full potential of genomics and to ensuring that scientific findings and advancement can truly benefit all people. However, datasets used throughout all stages of genomic research critically lack diversity. To address this challenge, guidance currently under development will indicate how researchers can promote diversity in datasets throughout the genomic research process. Topics include assessing diversity in data acquisition, considering the usability of a dataset, measuring diversity in research, and promoting transparency in data use, research, and publication.

Working meeting to promote and discuss the meaning of diversity in genomic datasets

1 Hour

Don't see your name? Get in touch:

  • Mutiat Afolabi
    Wellcome Sanger Institute (WSI)
  • Shu Hui Chen
    NIH National Heart, Lung, and Blood Institute (NHLBI)
  • Megan Doerr
    Sage Bionetworks
  • Tina Hernandez-Boussard
    Stanford University
  • Jacob Shujui Hsu
    National Taiwan University
  • Sumit Jamuar
    Global Gene Corp
  • Saumya Jamuar
    KK Women's and Children's Hospital
  • Beatrice Kaiser
    McGill University / Université McGill, Centre of Genomics and Policy
  • Anna Lewis
    Harvard University
  • Zane Lombard
    University of the Witwatersrand, National Health Laboratory Service
  • Maxine Mackintosh
    Genomics England
  • Maili Raven-Adams
    The Nuffield Council on Bioethics
  • Alham Saadat
    Broad Institute of MIT and Harvard
  • Sikha Singh
    Association of Public Health Laboratories
  • Diya Uberoi
    McGill University / Université McGill, Centre of Genomics and Policy

News, events, and more

Catch up with all news and articles associated with Diversity in Datasets.

28 May 2024
GA4GH submits comments on the WHO’s draft principles for human genome access, use, and sharing
25 Mar 2022
OmicsXchange episode 14: genomic surveillance and outbreak response in Africa with Alan Christoffels
10 Dec 2020
OmicsXchange episode 12: the need for further inclusion of diversity in studies — an interview with Nicola Mulder