Your browser doesn't support javascript.
loading
Phase transition in the computational complexity of the shortest common superstring and genome assembly.
Fernandez, L A; Martin-Mayor, V; Yllanes, D.
Afiliação
  • Fernandez LA; Departamento de Física Teórica, Universidad Complutense, 28040 Madrid, Spain.
  • Martin-Mayor V; Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), 50018 Zaragoza, Spain.
  • Yllanes D; Departamento de Física Teórica, Universidad Complutense, 28040 Madrid, Spain.
Phys Rev E ; 109(1-1): 014133, 2024 Jan.
Article em En | MEDLINE | ID: mdl-38366408
ABSTRACT
Genome assembly, the process of reconstructing a long genetic sequence by aligning and merging short fragments, or reads, is known to be NP-hard, either as a version of the shortest common superstring problem or in a Hamiltonian-cycle formulation. That is, the computing time is believed to grow exponentially with the problem size in the worst case. Despite this fact, high-throughput technologies and modern algorithms currently allow bioinformaticians to handle datasets of billions of reads. Using methods from statistical mechanics, we address this conundrum by demonstrating the existence of a phase transition in the computational complexity of the problem and showing that practical instances always fall in the "easy" phase (solvable by polynomial-time algorithms). In addition, we propose a Markov-chain Monte Carlo method that outperforms common deterministic algorithms in the hard regime.

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Idioma: En Revista: Phys Rev E Ano de publicação: 2024 Tipo de documento: Article País de afiliação: Espanha

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Idioma: En Revista: Phys Rev E Ano de publicação: 2024 Tipo de documento: Article País de afiliação: Espanha