Improved Node and Arc Multiplicity Estimation in De Bruijn Graphs Using Approximate Inference in Conditional Random Fields.
IEEE/ACM Trans Comput Biol Bioinform; 20(3): 1995-2006, 2023.
Article in En | MEDLINE | ID: mdl-37015543
ABSTRACT
In de novo genome assembly using short Illumina reads, the accurate determination of node and arc multiplicities in a de Bruijn graph has a large impact on the quality and contiguity of the assembly. The multiplicity estimates of nodes and arcs guide the cleaning of the de Bruijn graph by identifying spurious nodes and arcs that correspond to sequencing errors. Additionally, they can be used to guide repeat resolution. Here, we model the entire de Bruijn graph and the accompanying read coverage information with a single Conditional Random Field (CRF) model. We show that approximate inference using Loopy Belief Propagation (LBP) on our model improves multiplicity assignment accuracy within feasible runtimes. The order in which messages are passed has a large influence on the speed of LBP convergence. Few theoretical guarantees exist, and the conditions for convergence are not easily checked because our CRF model contains higher-order interactions. Therefore, we also present an empirical evaluation of several message passing schemes that may guide future users of LBP on CRFs with higher-order interactions in their choice of message passing scheme.
Subjects

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Fatigue | Study type: Clinical_trials | Limit: Humans | Language: En | Journal: ACM Trans Comput Biol Bioinform | Journal subject: BIOLOGY / MEDICAL INFORMATICS | Publication year: 2023 | Document type: Article | Country of publication: United States