7bgzf: Replacing samtools bgzip deflation for archiving and real-time compression.
Comput Biol Chem; 85: 107207, 2020 Apr.
Article in En | MEDLINE | ID: mdl-32092548
ABSTRACT
BACKGROUND:
Genomic sequence data are not only massive but also growing rapidly every day; compressing such data is therefore essential for sharing. Although some specialized compressors exist, they lack interoperability. In this study, a SAMtools bgzip variant named 7bgzf was developed, incorporating several compression and deflation algorithms in addition to the widely used zlib algorithm. An extensive benchmarking study was carried out against available data compression software.
RESULTS:
On both x64 and ARM machines, igzip ran very fast. When high compression was the goal, libdeflate on the x64 platform delivered it with a tolerable loss of speed.
CONCLUSIONS:
With appropriate algorithm selection, the proposed compression method performed better than the original bgzip method while maintaining interoperability with existing software. This software is therefore useful both for distributing genomic sequence archives and for real-time compression in mobile computing.
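The interoperability claim rests on the BGZF container: each block is an ordinary gzip member carrying a `BC` extra field, so any compressor that emits raw DEFLATE can fill it and standard gzip readers can still decode the result. The sketch below illustrates that idea only; it is not the 7bgzf implementation. The block layout follows the SAM/BAM specification, the function names (`raw_deflate_zlib`, `bgzf_block`, `bgzf_compress`) are hypothetical, and Python's `zlib` stands in for whichever backend (igzip, libdeflate, or another) would be plugged in.

```python
import gzip
import struct
import zlib

# 28-byte empty BGZF block used as the end-of-file marker (SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f8b08040000000000ff0600424302001b0003000000000000000000"
)
MAX_BLOCK_INPUT = 0xFF00  # uncompressed bytes per block, as bgzip uses


def raw_deflate_zlib(data: bytes, level: int = 6) -> bytes:
    """Stand-in backend: raw DEFLATE via zlib (wbits=-15 means no wrapper)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def bgzf_block(data: bytes, deflate=raw_deflate_zlib) -> bytes:
    """Wrap one chunk in a BGZF block; `deflate` is the swappable backend."""
    cdata = deflate(data)
    total = 12 + 6 + len(cdata) + 8  # gzip header + extra field + payload + trailer
    assert total <= 0x10000, "a BGZF block must not exceed 64 KiB"
    # ID1, ID2, CM=8 (deflate), FLG=4 (FEXTRA), MTIME=0, XFL=0, OS=255, XLEN=6
    header = struct.pack("<4BI2BH", 0x1F, 0x8B, 8, 4, 0, 0, 0xFF, 6)
    # Extra subfield 'B','C', SLEN=2, BSIZE = total block size minus 1
    extra = struct.pack("<2BHH", ord("B"), ord("C"), 2, total - 1)
    # Trailer: CRC32 and length of the uncompressed chunk
    trailer = struct.pack("<II", zlib.crc32(data), len(data))
    return header + extra + cdata + trailer


def bgzf_compress(data: bytes, deflate=raw_deflate_zlib) -> bytes:
    """Split input into blocks, compress each independently, append the EOF marker."""
    blocks = [
        bgzf_block(data[i:i + MAX_BLOCK_INPUT], deflate)
        for i in range(0, len(data), MAX_BLOCK_INPUT)
    ]
    return b"".join(blocks) + BGZF_EOF


if __name__ == "__main__":
    payload = b"ACGT" * 50000
    stream = bgzf_compress(payload)
    # A generic gzip reader decodes the stream without knowing about BGZF.
    assert gzip.decompress(stream) == payload
    print(len(payload), "bytes in,", len(stream), "bytes out")
```

Because the container is fixed and only the `deflate` callable changes, a faster or denser backend can be substituted without affecting readers, which is the trade-off between speed and compression that the benchmark explores.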
Full text:
1
Collections:
01-internacional
Database:
MEDLINE
Main subject:
Algorithms / Software / Computational Biology / Data Compression
Language:
En
Journal:
Comput Biol Chem
Journal subject:
BIOLOGY / MEDICAL INFORMATICS / CHEMISTRY
Publication year:
2020
Document type:
Article