Error filtering, pair assembly and error correction for next-generation sequencing reads.

Edgar, Robert C; Flyvbjerg, Henrik

Affiliation

Edgar RC; Tiburon, CA 94920, USA.
Flyvbjerg H; Department of Micro- and Nanotechnology, Technical University of Denmark, DK-2800 Lyngby, Denmark.

Bioinformatics ; 31(21): 3476-82, 2015 Nov 01.

Article in En | MEDLINE | ID: mdl-26139637

ABSTRACT

MOTIVATION:

Next-generation sequencing produces vast amounts of data with errors that are difficult to distinguish from true biological variation when coverage is low.

RESULTS:

We demonstrate large reductions in error frequencies, especially for high-error-rate reads, by three independent means: (i) filtering reads according to their expected number of errors, (ii) assembling overlapping read pairs and (iii) for amplicon reads, exploiting unique sequence abundances to perform error correction. We also show that most published paired read assemblers calculate incorrect posterior quality scores.

AVAILABILITY AND IMPLEMENTATION:

These methods are implemented in the USEARCH package. Binaries are freely available at http://drive5.com/usearch.

CONTACT:

robert@drive5.com

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
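The expected-error filter named in (i) under RESULTS can be illustrated with a minimal sketch. It assumes the standard Phred definition, in which a quality score Q corresponds to an error probability of 10^(-Q/10), so the expected number of errors in a read is the sum of those probabilities over its bases. The function names, the Phred+33 ASCII offset and the threshold of 1.0 are assumptions chosen for illustration; this is not the USEARCH implementation.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities for one read.

    `quality` is a FASTQ quality string; `offset` is the ASCII offset
    (33 is assumed here, as in Phred+33 encoded files).
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_filter(quality: str, max_ee: float = 1.0, offset: int = 33) -> bool:
    """Keep a read only if its expected error count is at most `max_ee`.

    The default threshold is a hypothetical value, not one taken from the paper.
    """
    return expected_errors(quality, offset) <= max_ee


if __name__ == "__main__":
    high_quality = "I" * 10   # Q40 at every base
    low_quality = "+" * 20    # Q10 at every base
    print(expected_errors(high_quality), passes_filter(high_quality))
    print(expected_errors(low_quality), passes_filter(low_quality))
```

Because the expected error count grows with read length, a long read can fail a fixed threshold even when its average quality looks acceptable, so a filter of this kind responds to the total error burden rather than to the mean score.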

Subjects

High-Throughput Nucleotide Sequencing/methods; Algorithms; Software

Full text: 1 Database: MEDLINE Main subject: High-Throughput Nucleotide Sequencing Language: En Publication year: 2015 Document type: Article