GABAC: an arithmetic coding solution for genomic data.
Voges, Jan; Paridaens, Tom; Müntefering, Fabian; Mainzer, Liudmila S; Bliss, Brian; Yang, Mingyu; Ochoa, Idoia; Fostier, Jan; Ostermann, Jörn; Hernaez, Mikel.
Affiliation
  • Voges J; Institut für Informationsverarbeitung (TNT), Leibniz University Hannover, 30167 Hannover, Germany.
  • Paridaens T; IDLab, Ghent University-imec, 9050 Ghent, Belgium.
  • Müntefering F; Institut für Informationsverarbeitung (TNT), Leibniz University Hannover, 30167 Hannover, Germany.
  • Mainzer LS; National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
  • Bliss B; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
  • Yang M; National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
  • Ochoa I; Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
  • Fostier J; IDLab, Ghent University-imec, 9050 Ghent, Belgium.
  • Ostermann J; Institut für Informationsverarbeitung (TNT), Leibniz University Hannover, 30167 Hannover, Germany.
  • Hernaez M; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
Bioinformatics; 36(7): 2275-2277, 2020 Apr 01.
Article in English | MEDLINE | ID: mdl-31830243
ABSTRACT
MOTIVATION:

In an effort to provide a response to the ever-expanding generation of genomic data, the International Organization for Standardization (ISO) is designing a new solution for the representation, compression and management of genomic sequencing data: the Moving Picture Experts Group (MPEG)-G standard. This paper discusses the first implementation of an MPEG-G compliant entropy codec: GABAC. GABAC combines proven coding technologies, such as context-adaptive binary arithmetic coding, binarization schemes and transformations, into a straightforward solution for the compression of sequencing data.

RESULTS:

We demonstrate that GABAC outperforms well-established (entropy) codecs in a significant set of cases and thus can serve as an extension for existing genomic compression solutions, such as CRAM.

AVAILABILITY AND IMPLEMENTATION:

The GABAC library is written in C++. We also provide a command line application which exercises all features provided by the library. GABAC can be downloaded from https://github.com/mitogenomics/gabac.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
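
The abstract names three building blocks: transformations, binarization schemes and context-adaptive binary arithmetic coding. The Python sketch below is an illustration only and is not taken from the GABAC library or the MPEG-G specification. All function and class names are hypothetical. It assumes a delta transformation, a truncated unary binarization, and a simple count-based probability model per bin position. It does not implement an arithmetic coding engine; it only estimates the ideal code length that such a coder would approach given those adaptive probabilities.

```python
import math


def delta_transform(values):
    """Hypothetical transformation: replace each value by its difference
    from the previous value (the first value is kept as-is)."""
    prev = 0
    out = []
    for v in values:
        out.append(v - prev)
        prev = v
    return out


def truncated_unary(value, max_value):
    """Binarize 0 <= value <= max_value as `value` ones followed by a
    terminating zero; the zero is omitted when value == max_value."""
    if not 0 <= value <= max_value:
        raise ValueError("value out of range for truncated unary")
    bins = [1] * value
    if value < max_value:
        bins.append(0)
    return bins


class AdaptiveBitModel:
    """Count-based estimate of the bin probabilities for one context.
    Assumption: real CABAC engines use a different probability estimator."""

    def __init__(self):
        self.counts = [1, 1]  # start from one pseudo-count per bin value

    def cost(self, bit):
        """Ideal cost in bits of coding `bit` under the current estimate."""
        total = self.counts[0] + self.counts[1]
        return -math.log2(self.counts[bit] / total)

    def update(self, bit):
        """Adapt the estimate after the bin has been coded."""
        self.counts[bit] += 1


def estimate_bits(symbols, max_value):
    """Sum the ideal cost of all bins, using one adaptive context per bin
    position of the truncated unary code."""
    models = [AdaptiveBitModel() for _ in range(max_value)]
    total = 0.0
    for s in symbols:
        for pos, bit in enumerate(truncated_unary(s, max_value)):
            total += models[pos].cost(bit)
            models[pos].update(bit)
    return total


if __name__ == "__main__":
    positions = [7, 8, 8, 10, 11, 11, 12]      # made-up sorted values
    deltas = delta_transform(positions)[1:]    # drop the first (absolute) value
    print(estimate_bits(deltas, max_value=3))  # estimated size in bits
```

The design point this is meant to show is that the transformation concentrates the values near zero, the binarization turns each value into a short string of bins, and the adaptive models then assign low cost to the bins that occur most often.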

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Data Compression / High-Throughput Nucleotide Sequencing Language: En Publication year: 2020 Document type: Article
