Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 32
Filtrar
Más filtros

Banco de datos
Tipo del documento
Intervalo de año de publicación
1.
Mol Phylogenet Evol ; 199: 108137, 2024 Oct.
Artículo en Inglés | MEDLINE | ID: mdl-39029549

RESUMEN

The Hybridization problem asks to reconcile a set of conflicting phylogenetic trees into a single phylogenetic network with the smallest possible number of reticulation nodes. This problem is computationally hard and previous solutions are limited to small and/or severely restricted data sets, for example, a set of binary trees with the same taxon set or only two non-binary trees with non-equal taxon sets. Building on our previous work on binary trees, we present FHyNCH, the first algorithmic framework to heuristically solve the Hybridization problem for large sets of multifurcating trees whose sets of taxa may differ. Our heuristics combine the cherry-picking technique, recently proposed to solve the same problem for binary trees, with two carefully designed machine-learning models. We demonstrate that our methods are practical and produce qualitatively good solutions through experiments on both synthetic and real data sets.


Asunto(s)
Algoritmos , Aprendizaje Automático , Filogenia , Modelos Genéticos , Hibridación Genética
2.
Bull Math Biol ; 84(8): 76, 2022 06 21.
Artículo en Inglés | MEDLINE | ID: mdl-35727410

RESUMEN

Phylogenetic networks are used in biology to represent evolutionary histories. The class of orchard phylogenetic networks was recently introduced for their computational benefits, without any biological justification. Here, we show that orchard networks can be interpreted as trees with additional horizontal arcs. Therefore, they are closely related to tree-based networks, where the difference is that in tree-based networks the additional arcs do not need to be horizontal. Then, we use this new characterization to show that the space of orchard networks on n leaves with k reticulations is connected under the rNNI rearrangement move with diameter [Formula: see text].


Asunto(s)
Conceptos Matemáticos , Modelos Biológicos , Algoritmos , Evolución Biológica , Modelos Genéticos , Filogenia
3.
J Math Biol ; 83(3): 32, 2021 09 04.
Artículo en Inglés | MEDLINE | ID: mdl-34482446

RESUMEN

Phylogenetic networks can represent evolutionary events that cannot be described by phylogenetic trees. These networks are able to incorporate reticulate evolutionary events such as hybridization, introgression, and lateral gene transfer. Recently, network-based Markov models of DNA sequence evolution have been introduced along with model-based methods for reconstructing phylogenetic networks. For these methods to be consistent, the network parameter needs to be identifiable from data generated under the model. Here, we show that the semi-directed network parameter of a triangle-free, level-1 network model with any fixed number of reticulation vertices is generically identifiable under the Jukes-Cantor, Kimura 2-parameter, or Kimura 3-parameter constraints.


Asunto(s)
Evolución Molecular , Modelos Genéticos , Algoritmos , Hibridación Genética , Cadenas de Markov , Filogenia
4.
Syst Biol ; 67(3): 518-542, 2018 05 01.
Artículo en Inglés | MEDLINE | ID: mdl-29272537

RESUMEN

Phylogenetic networks are well suited to represent evolutionary histories comprising reticulate evolution. Several methods aiming at reconstructing explicit phylogenetic networks have been developed in the last two decades. In this article, we propose a new definition of maximum parsimony for phylogenetic networks that permits to model biological scenarios that cannot be modeled by the definitions currently present in the literature (namely, the "hardwired" and "softwired" parsimony). Building on this new definition, we provide several algorithmic results that lay the foundations for new parsimony-based methods for phylogenetic network reconstruction.


Asunto(s)
Clasificación/métodos , Modelos Biológicos , Filogenia , Algoritmos
5.
Bull Math Biol ; 81(10): 3823-3863, 2019 10.
Artículo en Inglés | MEDLINE | ID: mdl-31297691

RESUMEN

Network reconstruction lies at the heart of phylogenetic research. Two well-studied classes of phylogenetic networks include tree-child networks and level-k networks. In a tree-child network, every non-leaf node has a child that is a tree node or a leaf. In a level-k network, the maximum number of reticulations contained in a biconnected component is k. Here, we show that level-k tree-child networks are encoded by their reticulate-edge-deleted subnetworks, which are subnetworks obtained by deleting a single reticulation edge, if [Formula: see text]. Following this, we provide a polynomial-time algorithm for uniquely reconstructing such networks from their reticulate-edge-deleted subnetworks. Moreover, we show that this can even be done when considering subnetworks obtained by deleting one reticulation edge from each biconnected component with k reticulations.


Asunto(s)
Algoritmos , Filogenia , Biología Computacional , Evolución Molecular , Conceptos Matemáticos , Modelos Genéticos
6.
J Math Biol ; 79(5): 1623-1638, 2019 10.
Artículo en Inglés | MEDLINE | ID: mdl-31363828

RESUMEN

Unrooted phylogenetic networks are graphs used to represent reticulate evolutionary relationships. Accurately reconstructing such networks is of great relevance for evolutionary biology. It has recently been conjectured that all unrooted phylogenetic networks for at least five taxa can be uniquely reconstructed from their subnetworks obtained by deleting a single taxon. Here, we show that this conjecture is false, by presenting a counter-example for each possible number of taxa that is at least 4. Moreover, we show that the conjecture is still false when restricted to binary networks. This means that, even if we are able to reconstruct the unrooted evolutionary history of each proper subset of some taxon set, this still does not give us enough information to reconstruct their full unrooted evolutionary history.


Asunto(s)
Modelos Biológicos , Filogenia , Algoritmos , Evolución Biológica , Evolución Molecular , Conceptos Matemáticos , Modelos Genéticos
7.
J Math Biol ; 78(1-2): 527-547, 2019 01.
Artículo en Inglés | MEDLINE | ID: mdl-30121824

RESUMEN

Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically-relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A) even when the level of N-the maximum number of reticulation nodes within a biconnected component-is 1 and A contains only 2 distinct states. (If, additionally, gaps are allowed the problem becomes APX-hard.) We also show that under the same conditions, and assuming a simple binary symmetric model of character evolution, finding a most likely tree displayed by the network is NP-hard. These negative results contrast with earlier work on parsimony in which it is shown that if A consists of a single column the problem is fixed parameter tractable in the level. We conclude with a discussion of why, despite the NP-hardness, both the parsimony and likelihood problem can likely be well-solved in practice.


Asunto(s)
Modelos Genéticos , Filogenia , Algoritmos , Animales , Biología Computacional , Evolución Molecular , Especiación Genética , Humanos , Conceptos Matemáticos
8.
PLoS Comput Biol ; 13(8): e1005611, 2017 Aug.
Artículo en Inglés | MEDLINE | ID: mdl-28763439

RESUMEN

Phylogenetic tree reconstruction is usually done by local search heuristics that explore the space of the possible tree topologies via simple rearrangements of their structure. Tree rearrangement heuristics have been used in combination with practically all optimization criteria in use, from maximum likelihood and parsimony to distance-based principles, and in a Bayesian context. Their basic components are rearrangement moves that specify all possible ways of generating alternative phylogenies from a given one, and whose fundamental property is to be able to transform, by repeated application, any phylogeny into any other phylogeny. Despite their long tradition in tree-based phylogenetics, very little research has gone into studying similar rearrangement operations for phylogenetic network-that is, phylogenies explicitly representing scenarios that include reticulate events such as hybridization, horizontal gene transfer, population admixture, and recombination. To fill this gap, we propose "horizontal" moves that ensure that every network of a certain complexity can be reached from any other network of the same complexity, and "vertical" moves that ensure reachability between networks of different complexities. When applied to phylogenetic trees, our horizontal moves-named rNNI and rSPR-reduce to the best-known moves on rooted phylogenetic trees, nearest-neighbor interchange and rooted subtree pruning and regrafting. Besides a number of reachability results-separating the contributions of horizontal and vertical moves-we prove that rNNI moves are local versions of rSPR moves, and provide bounds on the sizes of the rNNI neighborhoods. The paper focuses on the most biologically meaningful versions of phylogenetic networks, where edges are oriented and reticulation events clearly identified. Moreover, our rearrangement moves are robust to the fact that networks with higher complexity usually allow a better fit with the data. Our goal is to provide a solid basis for practical phylogenetic network reconstruction.


Asunto(s)
Biología Computacional/métodos , Reordenamiento Génico/genética , Modelos Genéticos , Filogenia , Animales , Hominidae/genética , Humanos
9.
Bull Math Biol ; 80(8): 2177-2208, 2018 08.
Artículo en Inglés | MEDLINE | ID: mdl-29948885

RESUMEN

Popular methods for exploring the space of rooted phylogenetic trees use rearrangement moves such as rooted Nearest Neighbour Interchange (rNNI) and rooted Subtree Prune and Regraft (rSPR). Recently, these moves were generalized to rooted phylogenetic networks, which are a more suitable representation of reticulate evolutionary histories, and it was shown that any two rooted phylogenetic networks of the same complexity are connected by a sequence of either rSPR or rNNI moves. Here, we show that this is possible using only tail moves, which are a restricted version of rSPR moves on networks that are more closely related to rSPR moves on trees. The connectedness still holds even when we restrict to distance-1 tail moves (a localized version of tail moves). Moreover, we give bounds on the number of (distance-1) tail moves necessary to turn one network into another, which in turn yield new bounds for rSPR, rNNI and SPR (i.e. the equivalent of rSPR on unrooted networks). The upper bounds are constructive, meaning that we can actually find a sequence with at most this length for any pair of networks. Finally, we show that finding a shortest sequence of tail or rSPR moves is NP-hard.


Asunto(s)
Evolución Biológica , Modelos Biológicos , Filogenia , Algoritmos , Evolución Molecular , Conceptos Matemáticos , Modelos Genéticos
10.
Mol Biol Evol ; 33(8): 2151-62, 2016 08.
Artículo en Inglés | MEDLINE | ID: mdl-27189565

RESUMEN

Phylogenetic networks are a generalization of evolutionary trees that can be used to represent reticulate processes such as hybridization and recombination. Here, we introduce a new approach called TriLoNet (Trinet Level- one Network algorithm) to construct such networks directly from sequence alignments which works by piecing together smaller phylogenetic networks. More specifically, using a bottom up approach similar to Neighbor-Joining, TriLoNet constructs level-1 networks (networks that are somewhat more general than trees) from smaller level-1 networks on three taxa. In simulations, we show that TriLoNet compares well with Lev1athan, a method for reconstructing level-1 networks from three-leaved trees. In particular, in simulations we find that Lev1athan tends to generate networks that overestimate the number of reticulate events as compared with those generated by TriLoNet. We also illustrate TriLoNet's applicability using simulated and real sequence data involving recombination, demonstrating that it has the potential to reconstruct informative reticulate evolutionary histories. TriLoNet has been implemented in JAVA and is freely available at https://www.uea.ac.uk/computing/TriLoNet.


Asunto(s)
Evolución Biológica , Alineación de Secuencia/métodos , Algoritmos , Simulación por Computador , Evolución Molecular , Redes Reguladoras de Genes/genética , Modelos Genéticos , Filogenia
11.
Bull Math Biol ; 79(5): 1135-1154, 2017 05.
Artículo en Inglés | MEDLINE | ID: mdl-28386669

RESUMEN

Phylogenetic networks are a generalization of evolutionary trees that are used by biologists to represent the evolution of organisms which have undergone reticulate evolution. Essentially, a phylogenetic network is a directed acyclic graph having a unique root in which the leaves are labelled by a given set of species. Recently, some approaches have been developed to construct phylogenetic networks from collections of networks on 2- and 3-leaved networks, which are known as binets and trinets, respectively. Here we study in more depth properties of collections of binets, one of the simplest possible types of networks into which a phylogenetic network can be decomposed. More specifically, we show that if a collection of level-1 binets is compatible with some binary network, then it is also compatible with a binary level-1 network. Our proofs are based on useful structural results concerning lowest stable ancestors in networks. In addition, we show that, although the binets do not determine the topology of the network, they do determine the number of reticulations in the network, which is one of its most important parameters. We also consider algorithmic questions concerning binets. We show that deciding whether an arbitrary set of binets is compatible with some network is at least as hard as the well-known graph isomorphism problem. However, if we restrict to level-1 binets, it is possible to decide in polynomial time whether there exists a binary network that displays all the binets. We also show that to find a network that displays a maximum number of the binets is NP-hard, but that there exists a simple polynomial-time 1/3-approximation algorithm for this problem. It is hoped that these results will eventually assist in the development of new methods for constructing phylogenetic networks from collections of smaller networks.


Asunto(s)
Modelos Biológicos , Filogenia , Algoritmos , Evolución Biológica , Conceptos Matemáticos
12.
Trends Genet ; 29(8): 439-41, 2013 Aug.
Artículo en Inglés | MEDLINE | ID: mdl-23764187

RESUMEN

Networks allow the investigation of evolutionary relationships that do not fit a tree model. They are becoming a leading tool for describing the evolutionary relationships between organisms, given the comparative complexities among genomes.


Asunto(s)
Evolución Biológica , Modelos Genéticos , Filogenia , Animales , Genoma , Saccharomyces/genética
13.
Syst Biol ; 64(1): 102-11, 2015 Jan.
Artículo en Inglés | MEDLINE | ID: mdl-25236959

RESUMEN

Phylogenetic networks are a generalization of evolutionary trees and are an important tool for analyzing reticulate evolutionary histories. Recently, there has been great interest in developing new methods to construct rooted phylogenetic networks, that is, networks whose internal vertices correspond to hypothetical ancestors, whose leaves correspond to sampled taxa, and in which vertices with more than one parent correspond to taxa formed by reticulate evolutionary events such as recombination or hybridization. Several methods for constructing evolutionary trees use the strategy of building up a tree from simpler building blocks (such as triplets or clusters), and so it is natural to look for ways to construct networks from smaller networks. In this article, we shall demonstrate a fundamental issue with this approach. Namely, we show that even if we are given all of the subnetworks induced on all proper subsets of the leaves of some rooted phylogenetic network, we still do not have all of the information required to completely determine that network. This implies that even if all of the building blocks for some reticulate evolutionary history were to be taken as the input for any given network building method, the method might still output an incorrect history. We also discuss some potential consequences of this result for constructing phylogenetic networks.


Asunto(s)
Clasificación/métodos , Filogenia , Cryptococcus gattii/clasificación , Cryptococcus gattii/genética , Modelos Teóricos
14.
Bull Math Biol ; 78(9): 1773-1795, 2016 09.
Artículo en Inglés | MEDLINE | ID: mdl-27659024

RESUMEN

Phylogenetic networks are increasingly used in evolutionary biology to represent the history of species that have undergone reticulate events such as horizontal gene transfer, hybrid speciation and recombination. One of the most fundamental questions that arise in this context is whether the evolution of a gene with one copy in all species can be explained by a given network. In mathematical terms, this is often translated in the following way: is a given phylogenetic tree contained in a given phylogenetic network? Recently this tree containment problem has been widely investigated from a computational perspective, but most studies have only focused on the topology of the phylogenies, ignoring a piece of information that, in the case of phylogenetic trees, is routinely inferred by evolutionary analyses: branch lengths. These measure the amount of change (e.g., nucleotide substitutions) that has occurred along each branch of the phylogeny. Here, we study a number of versions of the tree containment problem that explicitly account for branch lengths. We show that, although length information has the potential to locate more precisely a tree within a network, the problem is computationally hard in its most general form. On a positive note, for a number of special cases of biological relevance, we provide algorithms that solve this problem efficiently. This includes the case of networks of limited complexity, for which it is possible to recover, among the trees contained by the network with the same topology as the input tree, the closest one in terms of branch lengths.


Asunto(s)
Modelos Genéticos , Filogenia , Algoritmos , Evolución Molecular , Conceptos Matemáticos
15.
BMC Bioinformatics ; 15: 127, 2014 May 05.
Artículo en Inglés | MEDLINE | ID: mdl-24884964

RESUMEN

BACKGROUND: Reticulate events play an important role in determining evolutionary relationships. The problem of computing the minimum number of such events to explain discordance between two phylogenetic trees is a hard computational problem. Even for binary trees, exact solvers struggle to solve instances with reticulation number larger than 40-50. RESULTS: Here we present CycleKiller and NonbinaryCycleKiller, the first methods to produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations. CONCLUSIONS: Using simulations, we demonstrate that these algorithms run quickly for large and difficult instances, producing solutions that are very close to optimality. As a spin-off from our simulations we also present TerminusEst, which is the fastest exact method currently available that can handle nonbinary trees: this is used to measure the accuracy of the NonbinaryCycleKiller algorithm. All three methods are based on extensions of previous theoretical work (SIDMA 26(4):1635-1656, TCBB 10(1):18-25, SIDMA 28(1):49-66) and are publicly available. We also apply our methods to real data.


Asunto(s)
Algoritmos , Filogenia , Hibridación Genética , Programas Informáticos
16.
J Math Biol ; 68(7): 1707-29, 2014 Jun.
Artículo en Inglés | MEDLINE | ID: mdl-23680992

RESUMEN

Phylogenetic networks generalize evolutionary trees, and are commonly used to represent evolutionary histories of species that undergo reticulate evolutionary processes such as hybridization, recombination and lateral gene transfer. Recently, there has been great interest in trying to develop methods to construct rooted phylogenetic networks from triplets, that is rooted trees on three species. However, although triplets determine or encode rooted phylogenetic trees, they do not in general encode rooted phylogenetic networks, which is a potential issue for any such method. Motivated by this fact, Huber and Moulton recently introduced trinets as a natural extension of rooted triplets to networks. In particular, they showed that [Formula: see text] phylogenetic networks are encoded by their trinets, and also conjectured that all "recoverable" rooted phylogenetic networks are encoded by their trinets. Here we prove that recoverable binary level-2 networks and binary tree-child networks are also encoded by their trinets. To do this we prove two decomposition theorems based on trinets which hold for all recoverable binary rooted phylogenetic networks. Our results provide some additional evidence in support of the conjecture that trinets encode all recoverable rooted phylogenetic networks, and could also lead to new approaches to construct phylogenetic networks from trinets.


Asunto(s)
Modelos Genéticos , Filogenia , Algoritmos , Evolución Biológica , Redes Reguladoras de Genes , Especiación Genética , Conceptos Matemáticos
17.
Algorithms Mol Biol ; 18(1): 13, 2023 Sep 16.
Artículo en Inglés | MEDLINE | ID: mdl-37717003

RESUMEN

BACKGROUND: Combining a set of phylogenetic trees into a single phylogenetic network that explains all of them is a fundamental challenge in evolutionary studies. Existing methods are computationally expensive and can either handle only small numbers of phylogenetic trees or are limited to severely restricted classes of networks. RESULTS: In this paper, we apply the recently-introduced theoretical framework of cherry picking to design a class of efficient heuristics that are guaranteed to produce a network containing each of the input trees, for practical-size datasets consisting of binary trees. Some of the heuristics in this framework are based on the design and training of a machine learning model that captures essential information on the structure of the input trees and guides the algorithms towards better solutions. We also propose simple and fast randomised heuristics that prove to be very effective when run multiple times. CONCLUSIONS: Unlike the existing exact methods, our heuristics are applicable to datasets of practical size, and the experimental study we conducted on both simulated and real data shows that these solutions are qualitatively good, always within some small constant factor from the optimum. Moreover, our machine-learned heuristics are one of the first applications of machine learning to phylogenetics and show its promise.

18.
Bioinformatics ; 26(12): i124-31, 2010 Jun 15.
Artículo en Inglés | MEDLINE | ID: mdl-20529896

RESUMEN

UNLABELLED: Phylogenetic trees are widely used to display estimates of how groups of species are evolved. Each phylogenetic tree can be seen as a collection of clusters, subgroups of the species that evolved from a common ancestor. When phylogenetic trees are obtained for several datasets (e.g. for different genes), then their clusters are often contradicting. Consequently, the set of all clusters of such a dataset cannot be combined into a single phylogenetic tree. Phylogenetic networks are a generalization of phylogenetic trees that can be used to display more complex evolutionary histories, including reticulate events, such as hybridizations, recombinations and horizontal gene transfers. Here, we present the new Cass algorithm that can combine any set of clusters into a phylogenetic network. We show that the networks constructed by Cass are usually simpler than networks constructed by other available methods. Moreover, we show that Cass is guaranteed to produce a network with at most two reticulations per biconnected component, whenever such a network exists. We have implemented Cass and integrated it into the freely available Dendroscope software. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Asunto(s)
Filogenia , Algoritmos , Análisis por Conglomerados , Evolución Molecular
19.
J Theor Biol ; 269(1): 245-55, 2011 Jan 21.
Artículo en Inglés | MEDLINE | ID: mdl-21044634

RESUMEN

Rooted phylogenetic networks are used to model non-treelike evolutionary histories. Such networks are often constructed by combining trees, clusters, triplets or characters into a single network that in some well-defined sense simultaneously represents them all. We review these four models and investigate how they are related. Motivated by the parsimony principle, one often aims to construct a network that contains as few reticulations (non-treelike evolutionary events) as possible. In general, the model chosen influences the minimum number of reticulation events required. However, when one obtains the input data from two binary (i.e. fully resolved) trees, we show that the minimum number of reticulations is independent of the model. The number of reticulations necessary to represent the trees, triplets, clusters (in the softwired sense) and characters (with unrestricted multiple crossover recombination) are all equal. Furthermore, we show that these results also hold when not the number of reticulations but the level of the constructed network is minimised. We use these unification results to settle several computational complexity questions that have been open in the field for some time. We also give explicit examples to show that already for data obtained from three binary trees the models begin to diverge.


Asunto(s)
Filogenia , Análisis por Conglomerados , Modelos Genéticos , Recombinación Genética/genética
20.
BMC Ecol Evol ; 21(1): 220, 2021 12 07.
Artículo en Inglés | MEDLINE | ID: mdl-34876022

RESUMEN

BACKGROUND: Rooted phylogenetic networks are used to display complex evolutionary history involving so-called reticulation events, such as genetic recombination. Various methods have been developed to construct such networks, using for example a multiple sequence alignment or multiple phylogenetic trees as input data. Coronaviruses are known to recombine frequently, but rooted phylogenetic networks have not yet been used extensively to describe their evolutionary history. Here, we created a workflow to compare the evolutionary history of SARS-CoV-2 with other SARS-like viruses using several rooted phylogenetic network inference algorithms. This workflow includes filtering noise from sets of phylogenetic trees by contracting edges based on branch length and bootstrap support, followed by resolution of multifurcations. We explored the running times of the network inference algorithms, the impact of filtering on the properties of the produced networks, and attempted to derive biological insights regarding the evolution of SARS-CoV-2 from them. RESULTS: The network inference algorithms are capable of constructing rooted phylogenetic networks for coronavirus data, although running-time limitations require restricting such datasets to a relatively small number of taxa. Filtering generally reduces the number of reticulations in the produced networks and increases their temporal consistency. Taxon bat-SL-CoVZC45 emerges as a major and structural source of discordance in the dataset. The tested algorithms often indicate that SARS-CoV-2/RaTG13 is a tree-like clade, with possibly some reticulate activity further back in their history. A smaller number of constructed networks posit SARS-CoV-2 as a possible recombinant, although this might be a methodological artefact arising from the interaction of bat-SL-CoVZC45 discordance and the optimization criteria used. CONCLUSION: Our results demonstrate that as part of a wider workflow and with careful attention paid to running time, rooted phylogenetic network algorithms are capable of producing plausible networks from coronavirus data. These networks partly corroborate existing theories about SARS-CoV-2, and partly produce new avenues for exploration regarding the location and significance of reticulate activity within the wider group of SARS-like viruses. Our workflow may serve as a model for pipelines in which phylogenetic network algorithms can be used to analyse different datasets and test different hypotheses.


Asunto(s)
COVID-19 , SARS-CoV-2 , Algoritmos , Humanos , Filogenia
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA