Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs.
Limasset, Antoine; Flot, Jean-François; Peterlongo, Pierre.
Affiliation
  • Limasset A; Evolutionary Biology & Ecology, Université Libre de Bruxelles (ULB), Bruxelles, Belgium.
  • Flot JF; Evolutionary Biology & Ecology, Université Libre de Bruxelles (ULB), Bruxelles, Belgium.
  • Peterlongo P; Interuniversity Institute of Bioinformatics in Brussels - (IB)², Brussels, Belgium.
Bioinformatics; 36(5): 1374-1381, 2020 03 01.
Article in English | MEDLINE | ID: mdl-30785192
ABSTRACT
MOTIVATION:

Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well to large datasets or treat reads as mere sequences of k-mers, without taking into account their full-length sequence information.

RESULTS:

We propose a new method to correct short reads using de Bruijn graphs and implement it as a tool called Bcool. As a first step, Bcool constructs a compacted de Bruijn graph from the reads. This graph is filtered first by k-mer abundance and then by unitig abundance, thereby removing most sequencing errors. The cleaned graph is then used as a reference onto which the reads are mapped to correct them. We show that this approach yields more accurate reads than k-mer-spectrum correctors while being scalable to human-size genomic datasets and beyond.

AVAILABILITY AND IMPLEMENTATION:

The implementation is open source, available at http://github.com/Malfoy/BCOOL under the Affero GPL license and as a Bioconda package.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
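As a rough illustration of the strategy described under RESULTS, the following toy Python sketch counts k-mers, keeps only the abundant ones as the nodes of an implicit de Bruijn graph, and then walks each read along that graph to fix isolated substitutions. It is not Bcool's implementation: the function names, the greedy walk, and the thresholds are assumptions made for this example, and it deliberately omits unitig compaction, unitig-abundance filtering, reverse complements, and real read mapping.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(counts, min_abundance):
    """Keep k-mers seen at least min_abundance times.

    The returned set stands in for the nodes of a cleaned de Bruijn graph.
    """
    return {kmer for kmer, n in counts.items() if n >= min_abundance}


def correct_read(read, graph, k):
    """Greedy toy correction (an assumption of this sketch, not Bcool's mapper).

    Anchor on the first k-mer of the read found in the graph, then walk to the
    right. When the k-mer ending at position i is missing from the graph,
    substitute base i only if exactly one alternative base restores a graph
    k-mer. Bases to the left of the anchor are left untouched.
    """
    anchor = next(
        (i for i in range(len(read) - k + 1) if read[i:i + k] in graph),
        None,
    )
    if anchor is None:
        return read  # nothing in the graph to anchor on
    bases = list(read)
    for i in range(anchor + k, len(bases)):
        prefix = "".join(bases[i - k + 1:i])
        if prefix + bases[i] in graph:
            continue
        candidates = [b for b in "ACGT" if prefix + b in graph]
        if len(candidates) != 1:
            break  # dead end or ambiguous branch: stop correcting
        bases[i] = candidates[0]
    return "".join(bases)


# Hypothetical data: three error-free copies of a short sequence and one
# read carrying a single substitution.
genome = "ACGTTGCATGCCGATAGC"
noisy = "ACGTTGCATGACGATAGC"
k = 5
graph = solid_kmers(count_kmers([genome] * 3 + [noisy], k), min_abundance=2)
print(correct_read(noisy, graph, k))
```

In this sketch the abundance threshold plays the role of the k-mer filter: k-mers created by the single error appear only once and are dropped, so the walk can recover the original base from the remaining graph.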
Subject(s)

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Algorithms / High-Throughput Nucleotide Sequencing Limits: Humans Language: En Journal: Bioinformatics Journal subject: Medical Informatics Year: 2020 Document type: Article Affiliation country: Belgium