CRAM

Uses data compression strategies to efficiently store genomic data

As genomic sequencing increases around the globe, it becomes vital to store the generated data efficiently and sustainably. Maintained by the GA4GH Large Scale Genomics (LSG) Work Stream, the CRAM file format tackles this challenge by efficiently storing genomic data. CRAM can help reduce disk space and storage costs by 30 to 50 per cent, accurately track reference genomes, improve integration within the field, and make data easy to transfer and share with collaborators.


Benefits

  • Significantly reduces the size of genomic sequencing data through compression
  • Reduces storage costs and fosters faster data sharing with collaborators
  • Provides random access by genomic region and also selectively by data type
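As a rough illustration of why access "by genomic region and also selectively by data type" can be cheap, here is a toy Python sketch of a column-oriented layout. It is not the real CRAM container format: the `Read` type, `to_columns`, and `positions_in_region` are hypothetical names invented for this example, and an actual CRAM file uses a more elaborate on-disk structure with an index.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical, heavily simplified alignment record."""
    chrom: str
    pos: int
    bases: str
    quals: str


def to_columns(reads: List[Read]) -> Dict[str, Dict[str, list]]:
    """Group reads by chromosome and store each field as its own list (column)."""
    blocks = defaultdict(lambda: {"pos": [], "bases": [], "quals": []})
    for read in reads:
        block = blocks[read.chrom]
        block["pos"].append(read.pos)
        block["bases"].append(read.bases)
        block["quals"].append(read.quals)
    return dict(blocks)


def positions_in_region(columns: Dict[str, Dict[str, list]],
                        chrom: str, start: int, end: int) -> List[int]:
    """Answer a region query from the 'pos' column of one chromosome only.

    The 'bases' and 'quals' columns are never read, which is the point of
    keeping each data type separate.
    """
    return [p for p in columns.get(chrom, {}).get("pos", []) if start <= p < end]


reads = [
    Read("chr1", 100, "ACGT", "IIII"),
    Read("chr1", 250, "TTGA", "IIHH"),
    Read("chr2", 120, "GGCC", "HHHH"),
]
columns = to_columns(reads)
hits = positions_in_region(columns, "chr1", 0, 200)
```

In this sketch a query touches one chromosome's block and one field within it, so the cost scales with the data requested rather than with the whole file.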

Target users

Researchers and data custodians

Community resources

Dive deeper into this product! CRAM is a file format that uses various algorithms to compress the data it stores. Some of these algorithms are general-purpose, while others exploit the fact that most human genomes are very similar to the reference human genome. CRAM keeps files small by storing only the parts of a sequence that differ from the reference. CRAM’s column-oriented layout also lets users efficiently extract particular subsets of the file, such as selected data on particular chromosomes, which is one of the major use cases for genomic data.
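The idea of storing only what differs from the reference can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions (substitutions only, no insertions, deletions, or quality scores), not the actual CRAM codec; `encode_read` and `decode_read` are hypothetical helper names.

```python
from typing import List, Tuple

# (offset within the read, base that differs from the reference)
Diff = Tuple[int, str]


def encode_read(reference: str, start: int, read: str) -> Tuple[int, int, List[Diff]]:
    """Return (start, length, diffs), keeping only bases that mismatch the reference."""
    window = reference[start:start + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    diffs = [
        (offset, base)
        for offset, (ref_base, base) in enumerate(zip(window, read))
        if base != ref_base
    ]
    return start, len(read), diffs


def decode_read(reference: str, start: int, length: int, diffs: List[Diff]) -> str:
    """Rebuild the read by copying the reference window and applying the diffs."""
    bases = list(reference[start:start + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


reference = "ACGTACGTACGTACGT"
read = "ACGTTCGT"  # aligned at position 4, one base differs from the reference
record = encode_read(reference, 4, read)
restored = decode_read(reference, *record)
```

Here the eight-base read is represented by its position, its length, and a single substitution. Decoding needs the same reference sequence, which is why accurately tracking the reference genome matters for a format built this way.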


15 Feb 2023: Please review and provide your feedback for CRAM v3.1 and refget v2.0 by 14 March 2023.

This group meets to discuss all GA4GH File Formats maintained by the Large Scale Genomics Work Stream: SAM/BAM/CRAM and VCF/BCF.

  • Repeat: every two months; Day: Tuesday; Time: 00:00 UTC; Duration: 1 hour
  • Repeat: every two months; Day: Tuesday; Time: 20:00 UTC; Duration: 1 hour

Date

Version

15 Aug 2022
22 Jul 2020
N.A.


Related Driver Projects and Organisations

All of Us Research Program
ICGC ARGO
Japan Agency for Medical Research and Development (AMED)
Canadian Distributed Infrastructure for Genomics (CanDIG)
European Joint Programme on Rare Disease (EJP RD)
Human Heredity and Health in Africa (H3Africa)
Trans-Omics for Precision Medicine (TOPMed)
GEnome Medical alliance Japan (GEM Japan)

Don't see your name? Get in touch:

  • James Bonfield
    Wellcome Sanger Institute (WSI)
  • Daniel Cameron
    Walter and Eliza Hall Institute of Medical Research
  • Shu Hui Chen
    NIH National Heart, Lung, and Blood Institute (NHLBI)
  • Guy Cochrane
    Independent Contributor
  • Robert Davies
    Wellcome Sanger Institute (WSI)
  • Muhammad Haseeb
    EMBL's European Bioinformatics Institute (EBI)
  • John Marshall
    University of Glasgow
  • Arshiya Merchant
  • Martin Pollard
    Wellcome Sanger Institute (WSI)

News, events, and more

Catch up with all news and articles associated with CRAM.

8 Jul 2021
GA4GH standards in a global learning health system
1 Oct 2019
Guest post: seven myths about CRAM — the community standard for genomic data compression
9 Apr 2019
#CRAM4GH Twitter chat: recap