Distance-based phylogenetic inference from typing data: a unifying view.

Vaz, Cátia; Nascimento, Marta; Carriço, João A; Rocher, Tatiana; Francisco, Alexandre P

Affiliation

Vaz C; Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, and a researcher at INESC-ID.
Nascimento M; Instituto Superior Técnico, Universidade de Lisboa and a researcher at INESC-ID.
Carriço JA; Faculdade de Medicina, Instituto de Microbiologia and Instituto de Medicina Molecular, Universidade de Lisboa.
Rocher T; INESC-ID.
Francisco AP; Instituto Superior Técnico, Universidade de Lisboa, and a researcher at INESC-ID.

Brief Bioinform; 22(3), 2021 May 20.

Article in En | MEDLINE | ID: mdl-32734294

ABSTRACT

Typing methods are widely used in the surveillance of infectious diseases, in outbreak investigation and in studies of the natural history of an infection. Their use is becoming standard, in particular with the introduction of high-throughput sequencing. At the same time, the data being generated are massive, and many algorithms have been proposed for the phylogenetic analysis of typing data, addressing both correctness and scalability issues. Most distance-based algorithms for inferring phylogenetic trees follow the closest pair joining scheme, which is one of the approaches used in hierarchical clustering. Although phylogenetic inference algorithms may seem rather different, the main difference among them lies in how cluster proximity is defined and in which optimization criterion is used. Both cluster proximity and optimization criteria often rely on a model of evolution. In this work, we review these algorithms and provide a unified view of them. This is an important step not only toward better understanding such algorithms but also toward identifying possible computational bottlenecks and improvements, which matter when dealing with large data sets.
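The closest pair joining scheme mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes typing data given as allelic profiles compared by Hamming distance, and it uses two textbook cluster proximity definitions (single linkage and average linkage, the latter as used in UPGMA) to show that only the proximity function changes between variants. All function names are hypothetical.

```python
from itertools import combinations


def hamming(p, q):
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def single_linkage(dist, a, b):
    """Cluster proximity: smallest distance between members of a and b."""
    return min(dist[i][j] for i in a for j in b)


def average_linkage(dist, a, b):
    """Cluster proximity: mean distance between members (UPGMA-style)."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def closest_pair_joining(profiles, proximity):
    """Repeatedly join the two closest clusters until one remains.

    Returns the merges as (cluster_a, cluster_b, proximity) tuples,
    where clusters are tuples of profile indices.
    This naive version rescans all cluster pairs at every step,
    so it is only suitable for small inputs.
    """
    n = len(profiles)
    dist = [[hamming(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest proximity value.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: proximity(dist, *pair))
        merges.append((a, b, proximity(dist, a, b)))
        # Replace the two joined clusters by their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges
```

Calling `closest_pair_joining(profiles, single_linkage)` and `closest_pair_joining(profiles, average_linkage)` on the same list of profiles runs the same joining loop under two proximity definitions. The repeated scan over all cluster pairs is the kind of computational bottleneck that matters for large data sets.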

Subject(s)

Algorithms; Databases, Nucleic Acid; Evolution, Molecular; High-Throughput Nucleotide Sequencing; Models, Genetic; Phylogeny

Key words

clustering methods; phylogenetic inference; tree search algorithms

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Phylogeny / Algorithms / Evolution, Molecular / Databases, Nucleic Acid / High-Throughput Nucleotide Sequencing / Models, Genetic
Type of study: Prognostic_studies
Language: En
Journal: Brief Bioinform
Journal subject: BIOLOGIA / INFORMATICA MEDICA
Year: 2021
Document type: Article
