htsget

Allows users to download read and variation data for subsections of the genome

With great advances in sequencing technologies, research and healthcare settings are generating more data than ever before. Sharing and accessing these large volumes of data to enable new scientific discoveries, however, remains a challenge. Today, this is largely achieved by copying large files from one computer to another, a slow and resource-intensive process that does not scale. Maintained by the GA4GH Large Scale Genomics (LSG) Work Stream, the htsget API enables faster retrieval of read and variation data by allowing users to download only the subsections of the genome in which they are interested.

Benefits

  • Allows users to download only parts of a genome of interest
  • Enables faster and more efficient access to data

Target users

Researchers

Image summary: htsget helps researchers “stream” portions of data from a genome, rather than downloading the entire file.
Product Lead
  • Mike Lin
Staff Contact
Tools & Platforms

Community resources

Sharing large volumes of genomic data across different locations can enable the discovery of new genetic associations and provide supporting evidence for new findings. The htsget API provides a secure, consistent protocol for researchers to access data stored in different repositories, whether hosted in large public clouds or in more traditional infrastructure. Moving away from a file-centric approach to data sharing, htsget offers more flexible, efficient, and targeted access: users download only the subsections of the genome they need instead of the entire file.
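
As a rough illustration of what downloading a subsection can look like in practice, the Python sketch below builds a request for one genomic interval and then reads the JSON "ticket" that a server sends back. The query parameters (format, referenceName, start, end) and the ticket fields (htsget, urls, headers) follow our reading of the public htsget specification and should be checked against the current version. The server address, read ID, and helper function names are hypothetical, and no network calls are made.

```python
import base64
import json
from urllib.parse import urlencode, unquote_to_bytes


def build_reads_url(base_url, read_id, reference_name, start, end, fmt="BAM"):
    """Build a reads request for one genomic interval.

    Assumption: start is 0-based inclusive and end is exclusive,
    as we understand the htsget specification.
    """
    query = urlencode({
        "format": fmt,
        "referenceName": reference_name,
        "start": start,
        "end": end,
    })
    return f"{base_url.rstrip('/')}/reads/{read_id}?{query}"


def decode_data_uri(uri):
    """Decode an inline data: URI into raw bytes."""
    header, _, payload = uri.partition(",")
    if header.endswith(";base64"):
        return base64.b64decode(payload)
    return unquote_to_bytes(payload)


def plan_downloads(ticket_json):
    """Turn a ticket into an ordered list of download steps.

    Each step is either ("inline", bytes) for data embedded in the
    ticket, or ("fetch", url, headers) for a block the client must
    request. Concatenating the blocks in order yields the result file.
    """
    ticket = json.loads(ticket_json)["htsget"]
    steps = []
    for block in ticket["urls"]:
        url = block["url"]
        if url.startswith("data:"):
            steps.append(("inline", decode_data_uri(url)))
        else:
            steps.append(("fetch", url, block.get("headers", {})))
    return ticket.get("format", "BAM"), steps
```

A client would call build_reads_url with a hypothetical endpoint such as https://htsget.example.org, send the request, pass the JSON response to plan_downloads, and then fetch and concatenate the listed blocks in order. Only the requested region is transferred, which is the point of the protocol.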


Title: Working meeting focused on Rust implementation work and benchmarking implementations.
Info: Meets every eight weeks on a Wednesday at 5:00pm BST. Alternates with the other htsget meeting, which meets every eight weeks on a Tuesday at 10:00pm BST.
Repeat: Every Two Months
Day: Wednesday
Time: 17:00 UTC
Duration: 1 Hour

Title: Working meeting focused on Rust implementation work and benchmarking implementations.
Info: Meets every eight weeks on a Tuesday at 10:00pm BST. Alternates with the other htsget meeting, which meets every eight weeks on a Wednesday at 5:00pm BST.
Repeat: Every Two Months
Day: Tuesday
Time: 22:00 UTC
Duration: 1 Hour

Date

Version

24 May 2022
13 May 2019
23 Jan 2019
1 Jun 2018
10 Oct 2017

Related Driver Projects and Organisations

All of Us Research Program
Canadian Distributed Infrastructure for Genomics (CanDIG)
European Joint Programme on Rare Disease (EJP RD)
ELIXIR Beacon
ENA / EVA / EGA, EMBL's European Bioinformatics Institute (EBI), Centre for Genomic Regulation

Don't see your name? Get in touch:

  • Jeremy Adams
    DNAstack
  • Edmon Begoli
    Oak Ridge National Laboratory (ORNL)
  • Robert Davies
    Wellcome Sanger Institute (WSI)
  • Mallory Freeberg
    EMBL's European Bioinformatics Institute (EBI)
  • David Glazer
    Verily
  • Oliver Hofmann
    University of Melbourne Centre for Cancer Research
  • David Jackson
    Wellcome Sanger Institute (WSI)
  • Jerome Kelleher
    University of Oxford
  • Anders Leung
    Independent Contributor
  • Mike Lin
    Independent Contributor
  • John Marshall
    University of Glasgow
  • Shaikh Farhan Rashid
    University Health Network, Canadian Distributed Infrastructure for Genomics (CanDIG)
  • Augusto Rendon
    Genomics England
  • Roman Valls Guimera
    University of Melbourne Centre for Cancer Research

News, events, and more

Catch up with all news and articles associated with htsget.

8 Jul 2021
GA4GH standards in a global learning health system
5 Apr 2021
GA4GH shares seven open-source projects as part of Google Summer of Code 2021
2 Dec 2019
Genomics England implements GA4GH API to provide secure access to genomic data for the NHS