Figbird: a probabilistic method for filling gaps in genome assemblies.
Tarafder, Sumit; Islam, Mazharul; Shatabda, Swakkhar; Rahman, Atif.
Affiliation
  • Tarafder S; Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.
  • Islam M; Department of Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh.
  • Shatabda S; Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.
  • Rahman A; Department of Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh.
Bioinformatics; 38(15): 3717-3724, 2022 Aug 02.
Article in English | MEDLINE | ID: mdl-35731219
ABSTRACT
MOTIVATION:

Advances in sequencing technologies have led to the sequencing of the genomes of a multitude of organisms. However, draft genomes of many of these organisms contain a large number of gaps due to repeats in genomes, low sequencing coverage and limitations in sequencing technologies. Although several tools exist for filling gaps, many of them do not utilize all information relevant to gap filling.

RESULTS:

Here, we present a probabilistic method for filling gaps in draft genome assemblies using second-generation reads, based on a generative model for sequencing that takes into account information on insert sizes and sequencing errors. Unlike the graph-based methods adopted in the literature, our method is based on the expectation-maximization algorithm. Experiments on real biological datasets show that this novel approach can fill large portions of gaps with a small number of errors and misassemblies compared to other state-of-the-art gap-filling tools.

AVAILABILITY AND IMPLEMENTATION:

The method is implemented in C++ in a software named 'Filling Gaps by Iterative Read Distribution (Figbird)', which is available at https://github.com/SumitTarafder/Figbird.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
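
The idea described under RESULTS, scoring read placements with a generative model that accounts for insert sizes and sequencing errors and then updating the gap sequence from those scores, can be illustrated with a toy sketch. This is not Figbird's implementation or its actual model. Every function name, the Gaussian insert-size density, the uniform substitution-error model, the 0.25 probability assigned to unknown (`N`) bases, and the single voting pass standing in for an expectation-maximization iteration are assumptions chosen purely for illustration.

```python
import math


def read_likelihood(read, target, error_rate=0.01):
    """P(read | target window), assuming independent substitution errors.

    Unknown target bases ('N') are treated as uniform over A/C/G/T (assumption).
    """
    p = 1.0
    for r, t in zip(read, target):
        if t == "N":
            p *= 0.25
        elif r == t:
            p *= 1.0 - error_rate
        else:
            p *= error_rate / 3.0
    return p


def insert_size_density(insert_size, mean, std):
    """Gaussian density for an observed insert size (assumed distribution)."""
    z = (insert_size - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def placement_posteriors(read, candidate, mate_start, mean, std, error_rate=0.01):
    """E-step-like pass: posterior over start positions of `read` in `candidate`.

    The mate is assumed to be anchored at `mate_start`; the insert size is
    measured from the mate's start to the end of the placed read.
    """
    weights = {}
    for start in range(len(candidate) - len(read) + 1):
        window = candidate[start:start + len(read)]
        insert_size = start + len(read) - mate_start
        weights[start] = (read_likelihood(read, window, error_rate)
                          * insert_size_density(insert_size, mean, std))
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def fill_gap(candidate, reads, mate_starts, mean, std, error_rate=0.01):
    """M-step-like pass: replace each 'N' with the base that has the most
    posterior-weighted votes. Reads must contain only A, C, G and T."""
    votes = [dict.fromkeys("ACGT", 0.0) for _ in candidate]
    for read, mate_start in zip(reads, mate_starts):
        post = placement_posteriors(read, candidate, mate_start, mean, std, error_rate)
        for start, weight in post.items():
            for offset, base in enumerate(read):
                votes[start + offset][base] += weight
    filled = []
    for old, v in zip(candidate, votes):
        if old == "N" and max(v.values()) > 0.0:
            filled.append(max(v, key=v.get))
        else:
            filled.append(old)
    return "".join(filled)
```

In this sketch, a placement is favored only when the read both matches the flanking sequence and yields an insert size close to the library mean, which is the kind of information the abstract says purely graph-based approaches may leave unused. A real tool would iterate such passes until convergence and work in log space to avoid numerical underflow on long reads.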

Full text: 1 | Database: MEDLINE | Main subject: Software / High-Throughput Nucleotide Sequencing | Language: English | Journal: Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2022 | Document type: Article | Country of affiliation: Bangladesh
