Whole Genome Sequencing (WGS) Quality Control (QC) Standards approved as an official GA4GH product

13 Nov 2025

Developed within the Large-Scale Genomics Work Stream, the Whole Genome Sequencing (WGS) Quality Control (QC) Standards establish global best practices for consistent, reliable, and comparable genomic data quality across institutions. 

 

By Jaclyn Estrin, GA4GH Senior Science Writer

The Global Alliance for Genomics and Health (GA4GH) is pleased to announce that the Whole Genome Sequencing (WGS) Quality Control (QC) Standards have been approved as an official GA4GH product.

Developed within the Large-Scale Genomics (LSG) Work Stream, with contributions from the Genomic Knowledge Standards (GKS) Work Stream, the WGS QC standards establish a unified framework for assessing the quality of whole genome sequencing data. Product development was led by Justin Jeyakani, Maxime Hebrard, and Nicolas Bertin of Precision Health Research, Singapore (PRECISE), under the guidance of LSG Work Stream Manager, Reggan Thomas (EMBL’s European Bioinformatics Institute).

Over the last two decades, the number of global initiatives conducting whole genome sequencing has grown exponentially. The data generated holds the potential to inform a greater understanding of human health and disease. When this data is shared across institutions at a global scale, there is immense promise to drive research progress, speed up patient diagnoses, and advance human health outcomes.

Jeyakani underscored the challenge that drove the product development. He said, “Despite increasing efforts to share whole genome sequencing (WGS) data across research and clinical initiatives, the lack of standardised quality control (QC) definitions and methodologies remains a major barrier. Variability in data production processes, inconsistent implementation of QC metrics across analytical tools, and the absence of a unified QC framework hinder the comparison, integration, and reuse of WGS datasets. As a result, researchers are often forced to reprocess or independently verify data quality, a time consuming and costly effort that limits cross-study analysis, clinical decision making, and global data harmonisation.”

To address this challenge, GA4GH has developed the Whole Genome Sequencing Quality Control Standards — a structured set of formally defined QC metrics, reference implementations, and usage guidelines for short-read germline WGS data. 

“By establishing a common foundation for quality assessment and reporting, these standards aim to improve interoperability, reduce redundant effort, and increase confidence in the integrity and comparability of WGS data across institutions and applications,” said Jeyakani. “These standards improve trust, save time, and reduce ambiguity. They standardise what is being measured and how.”

The product has three core components:

  1. standardised quality control metric definitions for metadata, schema, and file formats to enable shareability and reduce ambiguity;
  2. a flexible and scalable reference implementation, including an example quality control workflow to demonstrate practical application of the standard;
  3. benchmarking resources, including standardised unit tests and benchmarking datasets to validate reference implementations and alternatives, as well as assess computational resources.

Early implementers of the standard include Precision Health Research, Singapore (PRECISE) and the International Cancer Genome Consortium (ICGC) ARGO project, demonstrating applicability across both national programmes and large-scale international studies. The WGS QC standards provide a strong foundation for global genomics research, and implementers can adapt the product to their own study or clinical context.

Widespread implementation of these standards can make it easier to compare data across research projects, scale data integration from multiple sources, and build trust in the integrity of shared genomic data. Together, these effects can strengthen the ability to share data across institutions, borders, and systems.

The product team is now working to integrate the product with others in the GA4GH ecosystem, such as Data Connect, to enhance alignment between standards. 

They are also working to support product uptake and implementation across other large repositories and reference genomes. This includes plans to align the product with the International Organization for Standardization (ISO) and collaborate with large genomic databases.

Looking forward, the team aims to expand the product scope to include long-read sequencing and somatic mutation pipelines. They aim to ensure that the product remains relevant and applicable to new developments and emerging technologies within the genomics field.

“Good quality WGS data is the foundation of reliable analysis and good science,” said Bertin. “These standards ensure that quality is measurable, consistent, and trusted, no matter where or how it was generated, empowering global genomics collaboration.”
