RETRACTED: LFQC: a lossless compression algorithm for FASTQ files.
Pathak, Sudipta; Rajasekaran, Sanguthevar.
Affiliation
  • Pathak S; Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-4155, USA.
  • Rajasekaran S; Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-4155, USA.
Bioinformatics ; 35(9): e1-e7, 2019 05 01.
Article in En | MEDLINE | ID: mdl-31051040
ABSTRACT
MOTIVATION:

Next-generation sequencing (NGS) technologies have revolutionized genomic research by reducing the cost of whole-genome sequencing. One of the biggest challenges posed by modern sequencing technology is the economical storage of NGS data. Storing raw data is infeasible because of its enormous size and high redundancy. In this article, we address the problem of storing and transmitting large FASTQ files using innovative compression techniques.

RESULTS:

We introduce a new lossless, non-reference-based FASTQ compression algorithm named Lossless FastQ Compressor (LFQC). We have compared our algorithm with other state-of-the-art big data compression algorithms, including gzip, bzip2, fastqz, fqzcomp, G-SQZ, SCALCE, Quip, DSRC and DSRC-LZ. This comparison reveals that our algorithm achieves better compression ratios. The improvement obtained is up to 225%. For example, on one of the datasets (SRR065390_1), the average improvement (over all the algorithms compared) is 74.62%.

AVAILABILITY AND IMPLEMENTATION:

The implementations are freely available for non-commercial purposes. They can be downloaded from http://engr.uconn.edu/~rajasek/FastqPrograms.zip.

Full text: 1 Database: MEDLINE Type of study: Prognostic_studies Language: En Year of publication: 2019 Document type: Article
