Phylo2Vec: a vector representation for binary trees.
Penn, Matthew J; Scheidwasser, Neil; Khurana, Mark P; Duchêne, David A; Donnelly, Christl A; Bhatt, Samir.
Affiliation
  • Penn MJ; Department of Statistics, University of Oxford, Oxford, United Kingdom.
  • Scheidwasser N; Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark.
  • Khurana MP; Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark.
  • Duchêne DA; Department of Statistics, University of Oxford, Oxford, United Kingdom.
  • Donnelly CA; Department of Statistics, University of Oxford, Oxford, United Kingdom.
  • Bhatt S; Pandemic Sciences Institute, University of Oxford, Oxford, United Kingdom.
Syst Biol; 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38935520
ABSTRACT
Binary phylogenetic trees inferred from biological data are central to understanding the shared history among evolutionary units. However, inferring the placement of latent nodes in a tree is computationally expensive. State-of-the-art methods rely on carefully designed heuristics for tree search, using different data structures for easy manipulation (e.g., classes in object-oriented programming languages) and for readable representation of trees (e.g., Newick-format strings). Here, we present Phylo2Vec, a parsimonious encoding for phylogenetic trees that serves as a unified approach for both manipulating and representing phylogenetic trees. Phylo2Vec maps any binary tree with n leaves to a unique integer vector of length n - 1. The advantages of Phylo2Vec are fourfold: (i) fast tree sampling, (ii) a compressed tree representation compared to a Newick string, (iii) quick and unambiguous verification of whether two binary trees are topologically identical, and (iv) a systematic ability to traverse tree space in very large or small jumps. As a proof of concept, we use Phylo2Vec for maximum likelihood inference on five real-world datasets and show that a simple hill-climbing-based optimisation scheme can efficiently traverse the vastness of tree space from a random to an optimal tree.
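To make the sampling and comparison claims concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract states only the vector length (n - 1) and that each tree maps to a unique vector; the per-entry range used here (entry i lies in {0, ..., 2i}) is an assumption, chosen because it yields 1 x 3 x 5 x ... x (2n - 3) vectors, which matches the known number of rooted, leaf-labelled binary trees with n leaves. All function names are hypothetical.

```python
import math
import random


def sample_vector(n_leaves, rng=None):
    """Draw a random integer vector of length n_leaves - 1.

    Assumption (not stated in the abstract): entry i is uniform on {0, ..., 2i}.
    """
    rng = rng or random.Random()
    return [rng.randint(0, 2 * i) for i in range(n_leaves - 1)]


def is_valid(v):
    """Check the assumed constraint 0 <= v[i] <= 2i for every position i."""
    return all(isinstance(x, int) and 0 <= x <= 2 * i for i, x in enumerate(v))


def count_vectors(n_leaves):
    """Number of vectors satisfying the assumed constraint: 1 * 3 * ... * (2n - 3)."""
    return math.prod(2 * i + 1 for i in range(n_leaves - 1))


def same_topology(v1, v2):
    """If each tree has a unique vector, comparing trees reduces to comparing vectors.

    Assumes both vectors were built with the same leaf labelling.
    """
    return list(v1) == list(v2)


if __name__ == "__main__":
    v = sample_vector(6, random.Random(0))
    print(v, is_valid(v), count_vectors(6))
```

Under this assumption, sampling a tree costs one bounded integer draw per entry, and an identity check is a linear-time list comparison, with no tree traversal needed.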
Full text: 1 Database: MEDLINE Language: English Journal: Syst Biol Journal subject: BIOLOGY Year: 2024 Document type: Article Country of affiliation: United Kingdom