Search | BVS Regional Portal

1.

The Space of Equidistant Phylogenetic Cactuses.

Huber, Katharina T; Moulton, Vincent; Owen, Megan; Spillner, Andreas; St John, Katherine.

Ann Comb ; 28(1): 1-32, 2024.

Article in English | MEDLINE | ID: mdl-38433929

ABSTRACT

An equidistant X-cactus is a type of rooted, arc-weighted, directed acyclic graph with leaf set X, that is used in biology to represent the evolutionary history of a set X of species. In this paper, we introduce and investigate the space of equidistant X-cactuses. This space contains, as a subset, the space of ultrametric trees on X that was introduced by Gavryushkin and Drummond. We show that equidistant-cactus space is a CAT(0)-metric space which implies, for example, that there are unique geodesic paths between points. As a key step to proving this, we present a combinatorial result concerning ranked rooted X-cactuses. In particular, we show that such graphs can be encoded in terms of a pairwise compatibility condition arising from a poset of collections of pairs of subsets of X that satisfy certain set-theoretic properties. As a corollary, we also obtain an encoding of ranked, rooted X-trees in terms of partitions of X, which provides an alternative proof that the space of ultrametric trees on X is CAT(0). We expect that our results will provide the basis for novel ways to perform statistical analyses on collections of equidistant X-cactuses, as well as new directions for defining and understanding spaces of more general, arc-weighted phylogenetic networks.

2.

SPECTRE: a suite of phylogenetic tools for reticulate evolution.

Bastkowski, Sarah; Mapleson, Daniel; Spillner, Andreas; Wu, Taoyang; Balvociute, Monika; Moulton, Vincent.

Bioinformatics ; 34(6): 1056-1057, 2018 03 15.

Article in English | MEDLINE | ID: mdl-29186450

ABSTRACT

Summary: Split-networks are a generalization of phylogenetic trees that have proven to be a powerful tool in phylogenetics. Various ways have been developed for computing such networks, including split-decomposition, NeighborNet, QNet and FlatNJ. Some of these approaches are implemented in the user-friendly SplitsTree software package. However, to give the user the option to adjust and extend these approaches and to facilitate their integration into analysis pipelines, there is a need for robust, open-source implementations of associated data structures and algorithms. Here, we present SPECTRE, a readily available, open-source library of data structures written in Java, that comes complete with new implementations of several pre-published algorithms and a basic interactive graphical interface for visualizing planar split networks. SPECTRE also supports the use of longer running algorithms by providing command line interfaces, which can be executed on servers or in High Performance Computing environments. Availability and implementation: Full source code is available under the GPLv3 license at: https://github.com/maplesond/SPECTRE. SPECTRE's core library is available from Maven Central at: https://mvnrepository.com/artifact/uk.ac.uea.cmp.spectre/core. Documentation is available at: http://spectre-suite-of-phylogenetic-tools-for-reticulate-evolution.readthedocs.io/en/latest/. Contact: sarah.bastkowski@earlham.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.

Subjects

Phylogeny , Algorithms , Gene Library , Software

3.

The minimum evolution problem is hard: a link between tree inference and graph clustering problems.

Bastkowski, Sarah; Moulton, Vincent; Spillner, Andreas; Wu, Taoyang.

Bioinformatics ; 32(4): 518-22, 2016 Feb 15.

Article in English | MEDLINE | ID: mdl-26500153

ABSTRACT

MOTIVATION: Distance methods are well suited for constructing massive phylogenetic trees. However, the computational complexity for Rzhetsky and Nei's minimum evolution (ME) approach, one of the earliest methods for constructing a phylogenetic tree from a distance matrix, remains open. RESULTS: We show that Rzhetsky and Nei's ME problem is NP-complete, and so probably computationally intractable. We do this by linking the ME problem to a graph clustering problem called the quasi-clique decomposition problem, which has recently also been shown to be NP-complete. We also discuss how this link could potentially open up some useful new connections between phylogenetics and graph clustering.

Subjects

Algorithms , Biological Evolution , Theoretical Models , Phylogeny , Cluster Analysis , Computer Simulation , Humans , Software

4.

Erratum to: Neighborhoods of trees in circular orderings.

Bastkowski, Sarah; Moulton, Vincent; Spillner, Andreas; Wu, Taoyang.

Bull Math Biol ; 77(3): 579, 2015 Mar.

Article in English | MEDLINE | ID: mdl-25753151

5.

Neighborhoods of trees in circular orderings.

Bastkowski, Sarah; Moulton, Vincent; Spillner, Andreas; Wu, Taoyang.

Bull Math Biol ; 77(1): 46-70, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25477080

ABSTRACT

In phylogenetics, a common strategy used to construct an evolutionary tree for a set of species [Formula: see text] is to search in the space of all such trees for one that optimizes some given score function (such as the minimum evolution, parsimony or likelihood score). As this can be computationally intensive, it was recently proposed to restrict such searches to the set of all those trees that are compatible with some circular ordering of the set [Formula: see text]. To inform the design of efficient algorithms to perform such searches, it is therefore of interest to find bounds for the number of trees compatible with a fixed ordering in the neighborhood of a tree that is determined by certain tree operations commonly used to search for trees: the nearest neighbor interchange (NNI), the subtree prune and regraft (SPR) and the tree bisection and reconnection (TBR) operations. We show that the size of such a neighborhood of a binary tree associated with the NNI operation is independent of the tree's topology, but that this is not the case for the SPR and TBR operations. We also give tight upper and lower bounds for the size of the neighborhood of a binary tree for the SPR and TBR operations and characterize those trees for which these bounds are attained.

Subjects

Genetic Models , Phylogeny , Algorithms , Molecular Evolution , Likelihood Functions , Mathematical Concepts

6.

Prioritizing populations for conservation using phylogenetic networks.

Volkmann, Logan; Martyn, Iain; Moulton, Vincent; Spillner, Andreas; Mooers, Arne O.

PLoS One ; 9(2): e88945, 2014.

Article in English | MEDLINE | ID: mdl-24586451

ABSTRACT

In the face of inevitable future losses to biodiversity, ranking species by conservation priority seems more than prudent. Setting conservation priorities within species (i.e., at the population level) may be critical as species ranges become fragmented and connectivity declines. However, existing approaches to prioritization (e.g., scoring organisms by their expected genetic contribution) are based on phylogenetic trees, which may be poor representations of differentiation below the species level. In this paper we extend evolutionary isolation indices used in conservation planning from phylogenetic trees to phylogenetic networks. Such networks better represent population differentiation, and our extension allows populations to be ranked in order of their expected contribution to the set. We illustrate the approach using data from two imperiled species: the spotted owl Strix occidentalis in North America and the mountain pygmy-possum Burramys parvus in Australia. Using previously published mitochondrial and microsatellite data, we construct phylogenetic networks and score each population by its relative genetic distinctiveness. In both cases, our phylogenetic networks capture the geographic structure of each species: geographically peripheral populations harbor less-redundant genetic information, increasing their conservation rankings. We note that our approach can be used with all conservation-relevant distances (e.g., those based on whole-genome, ecological, or adaptive variation) and suggest it be added to the assortment of tools available to wildlife managers for allocating effort among threatened populations.

Subjects

Biodiversity , Phylogeny , Animals , Australia , Conservation of Natural Resources

7.

FlatNJ: a novel network-based approach to visualize evolutionary and biogeographical relationships.

Balvociute, Monika; Spillner, Andreas; Moulton, Vincent.

Syst Biol ; 63(3): 383-96, 2014 May.

Article in English | MEDLINE | ID: mdl-24436254

ABSTRACT

Split networks are a type of phylogenetic network that allow visualization of conflict in evolutionary data. We present a new method for constructing such networks called FlatNetJoining (FlatNJ). A key feature of FlatNJ is that it produces networks that can be drawn in the plane in which labels may appear inside of the network. For complex data sets that involve, for example, non-neutral molecular markers, this can allow additional detail to be visualized as compared to previous methods such as split decomposition and NeighborNet. We illustrate the application of FlatNJ by applying it to whole HIV genome sequences, where recombination has taken place, fluorescent proteins in corals, where ancestral sequences are present, and mitochondrial DNA sequences from gall wasps, where biogeographical relationships are of interest. We find that the networks generated by FlatNJ can facilitate the study of genetic variation in the underlying molecular sequence data and, in particular, may help to investigate processes such as intra-locus recombination. FlatNJ has been implemented in Java and is freely available at www.uea.ac.uk/computing/software/flatnj.

Subjects

Classification/methods , Phylogeography/methods , Animals , HIV/classification , HIV/genetics , Phylogeny , Statistics as Topic , Wasps/classification , Wasps/genetics

8.

SuperQ: computing supernetworks from quartets.

Grünewald, Stefan; Spillner, Andreas; Bastkowski, Sarah; Bögershausen, Anja; Moulton, Vincent.

IEEE/ACM Trans Comput Biol Bioinform ; 10(1): 151-60, 2013.

Article in English | MEDLINE | ID: mdl-23702551

ABSTRACT

Supertrees are a commonly used tool in phylogenetics to summarize collections of partial phylogenetic trees. As a generalization of supertrees, phylogenetic supernetworks allow, in addition, the visual representation of conflict between the trees that is not possible to observe with a single tree. Here, we introduce SuperQ, a new method for constructing such supernetworks (SuperQ is freely available at www.uea.ac.uk/computing/superq). It works by first breaking the input trees into quartet trees, and then stitching these together to form a special kind of phylogenetic network, called a split network. This stitching process is performed using an adaptation of the QNet method for split network reconstruction employing a novel approach to use the branch lengths from the input trees to estimate the branch lengths in the resulting network. Compared with previous supernetwork methods, SuperQ has the advantage of producing a planar network. We compare the performance of SuperQ to the Z-closure and Q-imputation supernetwork methods, and also present an analysis of some published data sets as an illustration of its applicability.

Subjects

Computational Biology/methods , Phylogeny , Software , Genetic Databases , Fungal Genes , Plant Genes , Genetic Models

9.

Computing evolutionary distinctiveness indices in large scale analysis.

Martyn, Iain; Kuhn, Tyler S; Mooers, Arne O; Moulton, Vincent; Spillner, Andreas.

Algorithms Mol Biol ; 7: 6, 2012 Apr 13.

Article in English | MEDLINE | ID: mdl-22502588

ABSTRACT

We present optimal linear time algorithms for computing the Shapley values and 'heightened evolutionary distinctiveness' (HED) scores for the set of taxa in a phylogenetic tree. We demonstrate the efficiency of these new algorithms by applying them to a set of 10,000 reasonable 5139-species mammal trees. This is the first time these indices have been computed on such a large taxon and we contrast our findings with an ad-hoc index for mammals, fair proportion (FP), used by the Zoological Society of London's EDGE programme. Our empirical results follow expectations. In particular, the Shapley values are very strongly correlated with the FP scores, but provide a higher weight to the few monotremes that comprise the sister to all other mammals. We also find that the HED score, which measures a species' unique contribution to future subsets as a function of the probability that close relatives will go extinct, is very sensitive to the estimated probabilities. When they are low, HED scores are less than FP scores, and approach the simple measure of a species' age. Deviations (like the Solenodon genus of the West Indies) occur when sister species are both at high risk of extinction and their clade roots deep in the tree. Conversely, when endangered species have higher probabilities of being lost, HED scores can be greater than FP scores and species like the African elephant Loxodonta africana, the two solenodons and the thumbless bat Furipterus horrens can move up the rankings. We suggest that conservation attention be applied to such species that carry genetic responsibility for imperiled close relatives. We also briefly discuss extensions of Shapley values and HED scores that are possible with the algorithms presented here.
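As a point of reference for the fair proportion (FP) index mentioned in the abstract above, the following is a minimal Python sketch. It is not the authors' algorithm for Shapley values or HED scores. It assumes the usual definition of FP, in which each edge's length is shared equally among the leaves below it and a leaf's score is the sum of its shares along the path to the root. The dictionary-based tree representation, the function name `fair_proportion`, and the toy tree are illustrative assumptions.

```python
def fair_proportion(children, length, root):
    """Return {leaf: FP score} for a rooted tree.

    children: dict mapping a node to the list of its children
              (leaves may be absent or map to an empty list).
    length:   dict mapping a non-root node to the length of the
              edge joining it to its parent.
    """
    # Pre-order traversal: every parent appears before its children.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))

    # Bottom-up pass: number of leaves below each node.
    leaves_below = {}
    for node in reversed(order):
        kids = children.get(node, [])
        leaves_below[node] = 1 if not kids else sum(leaves_below[k] for k in kids)

    # Top-down pass: accumulate each edge's equal share along root-to-leaf paths.
    share = {root: 0.0}
    scores = {}
    for node in order:
        kids = children.get(node, [])
        if not kids:
            scores[node] = share[node]
        for k in kids:
            share[k] = share[node] + length[k] / leaves_below[k]
    return scores


# Hypothetical toy tree: root r has leaf a and internal node u; u has leaves b and c.
toy_children = {"r": ["a", "u"], "u": ["b", "c"]}
toy_length = {"a": 2.0, "u": 1.0, "b": 1.0, "c": 1.0}
toy_scores = fair_proportion(toy_children, toy_length, "r")
```

Both passes visit each node once, so this sketch runs in time linear in the number of nodes. Under the assumed definition the scores sum to the total edge length of the tree.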

10.

Constructing and drawing regular planar split networks.

Spillner, Andreas; Nguyen, Binh T; Moulton, Vincent.

IEEE/ACM Trans Comput Biol Bioinform ; 9(2): 395-407, 2012.

Article in English | MEDLINE | ID: mdl-21844634

ABSTRACT

Split networks are commonly used to visualize collections of bipartitions, also called splits, of a finite set. Such collections arise, for example, in evolutionary studies. Split networks can be viewed as a generalization of phylogenetic trees and may be generated using the SplitsTree package. Recently, the NeighborNet method for generating split networks has become rather popular, in part because it is guaranteed to always generate a circular split system, which can always be displayed by a planar split network. Even so, labels must be placed on the "outside" of the network, which might be problematic in some applications. To help circumvent this problem, it can be helpful to consider so-called flat split systems, which can be displayed by planar split networks where labels are allowed on the inside of the network too. Here, we present a new algorithm that is guaranteed to compute a minimal planar split network displaying a flat split system in polynomial time, provided the split system is given in a certain format. We will also briefly discuss two heuristics that could be useful for analyzing phylogeographic data and that allow the computation of flat split systems in this format in polynomial time.

Subjects

Algorithms , Computational Biology/methods , Phylogeny , Genetic Models
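To illustrate the notion of a circular split system used in the abstract above, here is a minimal Python sketch. It assumes the standard definition: a split A|B is circular with respect to a cyclic ordering of the taxa when A (equivalently B) occupies a contiguous stretch of that ordering. The function name `is_circular` and the example data are illustrative assumptions and do not come from the paper or from SplitsTree.

```python
def is_circular(split, ordering):
    """Check whether a split (a pair of disjoint, non-empty sets covering
    all taxa) is an interval of the cyclic ordering given as a list."""
    side_a, _ = split
    n = len(ordering)
    in_a = [taxon in side_a for taxon in ordering]
    # Walking once around the cycle, a contiguous block is entered once
    # and left once, so membership changes exactly twice.
    changes = sum(in_a[i] != in_a[(i + 1) % n] for i in range(n))
    return changes == 2


ordering = [1, 2, 3, 4, 5]
wrapping = ({5, 1}, {2, 3, 4})       # contiguous across the end of the list
interleaved = ({1, 3}, {2, 4, 5})    # not contiguous in this ordering
```

A split system would then be circular for this ordering if every one of its splits passes the check.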

11.

Metrics on multilabeled trees: interrelationships and diameter bounds.

Huber, Katharina T; Spillner, Andreas; Suchecki, Radoslaw; Moulton, Vincent.

IEEE/ACM Trans Comput Biol Bioinform ; 8(4): 1029-40, 2011.

Article in English | MEDLINE | ID: mdl-21116046

ABSTRACT

Multilabeled trees or MUL-trees, for short, are trees whose leaves are labeled by elements of some nonempty finite set X such that more than one leaf may be labeled by the same element of X. This class of trees includes phylogenetic trees and tree shapes. MUL-trees arise naturally in, for example, biogeography and gene evolution studies and also in the area of phylogenetic network reconstruction. In this paper, we introduce novel metrics which may be used to compare MUL-trees, most of which generalize well-known metrics on phylogenetic trees and tree shapes. These metrics can be used, for example, to better understand the space of MUL-trees or to help visualize collections of MUL-trees. In addition, we describe some relationships between the MUL-tree metrics that we present and also give some novel diameter bounds for these metrics. We conclude by briefly discussing some open problems as well as pointing out how MUL-tree metrics may be used to define metrics on the space of phylogenetic networks.

Subjects

Algorithms , Cluster Analysis , Computational Biology/methods , Genetic Models , Phylogeny , Biological Evolution , Genetic Databases

12.

Distinguishing between hot-spots and melting-pots of genetic diversity using haplotype connectivity.

Nguyen, Binh; Spillner, Andreas; Emerson, Brent C; Moulton, Vincent.

Algorithms Mol Biol ; 5: 19, 2010 Mar 20.

Article in English | MEDLINE | ID: mdl-20302665

ABSTRACT

We introduce a method to help identify how the genetic diversity of a species within a geographic region might have arisen. This problem appears, for example, in the context of identifying refugia in phylogeography, and in the conservation of biodiversity where it is a factor in nature reserve selection. Complementing current methods for measuring genetic diversity, we analyze pairwise distances between the haplotypes of a species found in a geographic region and derive a quantity, called haplotype connectivity, that aims to capture how divergent the haplotypes are relative to one another. We propose using haplotype connectivity to indicate whether, for geographic regions that harbor a highly diverse collection of haplotypes, diversity evolved inside a region over a long period of time (a "hot-spot") or is the result of a more recent mixture (a "melting-pot"). We describe how the haplotype connectivity for a collection of haplotypes can be computed efficiently and briefly discuss some related optimization problems that arise in this context. We illustrate the applicability of our method using two previously published data sets of a species of beetle from the genus Brachyderes and a species of tree from the genus Pinus.

13.

Inferring polyploid phylogenies from multiply-labeled gene trees.

Lott, Martin; Spillner, Andreas; Huber, Katharina T; Petri, Anna; Oxelman, Bengt; Moulton, Vincent.

BMC Evol Biol ; 9: 216, 2009 Aug 28.

Article in English | MEDLINE | ID: mdl-19715596

ABSTRACT

BACKGROUND: Gene trees that arise in the context of reconstructing the evolutionary history of polyploid species are often multiply-labeled, that is, the same leaf label can occur several times in a single tree. This property considerably complicates the task of forming a consensus of a collection of such trees compared to usual phylogenetic trees. RESULTS: We present a method for computing a consensus tree of multiply-labeled trees. As with the well-known greedy consensus tree approach for phylogenetic trees, our method first breaks the given collection of gene trees into a set of clusters. It then aims to insert these clusters one at a time into a tree, starting with the clusters that are supported by most of the gene trees. As the problem to decide whether a cluster can be inserted into a multiply-labeled tree is computationally hard, we have developed a heuristic method for solving this problem. CONCLUSION: We illustrate the applicability of our method using two collections of trees for plants of the genus Silene, that involve several allopolyploids at different levels.

Subjects

Phylogeny , Polyploidy , Algorithms , Animals , Plants/classification , Plants/genetics

14.

PADRE: a package for analyzing and displaying reticulate evolution.

Lott, Martin; Spillner, Andreas; Huber, Katharina T; Moulton, Vincent.

Bioinformatics ; 25(9): 1199-200, 2009 May 01.

Article in English | MEDLINE | ID: mdl-19269989

ABSTRACT

UNLABELLED: Recent advances in gene sequencing for polyploid species, coupled with standard phylogenetic tree reconstruction, lead to gene trees in which the same species can label several leaves. Such multi-labeled trees are then used to reconstruct the evolutionary history of the polyploid species in question. However, this reconstruction process requires new techniques that are not available in current phylogenetic software packages. Here, we describe the software package PADRE (Package for Analyzing and Displaying Reticulate Evolution) that implements such techniques, allowing the reconstruction of complex evolutionary histories for polyploids in the form of phylogenetic networks. AVAILABILITY: PADRE is an open-source Java program freely available from http://www.uea.ac.uk/cmp/research/cmpbio/PADRE.

Subjects

Computational Biology/methods , Phylogeny , Software , Molecular Evolution , Gene Regulatory Networks , User-Computer Interface

15.

The complexity of deriving multi-labeled trees from bipartitions.

Huber, Katharina T; Lott, Martin; Moulton, Vincent; Spillner, Andreas.

J Comput Biol ; 15(6): 639-51, 2008.

Article in English | MEDLINE | ID: mdl-18631026

ABSTRACT

Recently, multi-labeled trees have been used to help unravel the evolutionary origins of polyploid species. A multi-labeled tree is the same as a phylogenetic tree except that more than one leaf may be labeled by a single species, so that the leaf set of a multi-labeled tree can be regarded as a multiset. In contrast to phylogenetic trees, which can be efficiently encoded in terms of certain bipartitions of their leaf sets, we show that it is NP-hard to decide whether a collection of bipartitions of a multiset can be represented by a multi-labeled tree. Even so, we also show that it is possible to generalize to multi-labeled trees a well-known condition that characterizes when a collection of bipartitions encodes a phylogenetic tree. Using this generalization, we obtain a fixed-parameter algorithm for the above decision problem in terms of a parameter associated to the given multiset.

Subjects

Molecular Evolution , Genetic Models , Phylogeny , Polyploidy
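The abstract above refers to a well-known condition characterizing when a collection of bipartitions encodes a phylogenetic tree but does not name it. Assuming it is the classical pairwise compatibility condition for splits of an ordinary set (two splits A|B and C|D are compatible when at least one of the four intersections A∩C, A∩D, B∩C, B∩D is empty), the check can be sketched in Python as below. This covers only the ordinary, single-labeled case and is not the paper's generalization to multisets. The function names are illustrative.

```python
from itertools import combinations


def compatible(split1, split2):
    """Two splits (pairs of sets) are compatible if some side of one
    is disjoint from some side of the other."""
    a, b = split1
    c, d = split2
    return any(not (p & q) for p in (a, b) for q in (c, d))


def pairwise_compatible(splits):
    """Check the compatibility condition for every pair in the collection."""
    return all(compatible(s, t) for s, t in combinations(splits, 2))


# Example splits of the set {1, 2, 3, 4}.
s12 = (frozenset({1, 2}), frozenset({3, 4}))
s13 = (frozenset({1, 3}), frozenset({2, 4}))
s1 = (frozenset({1}), frozenset({2, 3, 4}))
```

Here `s12` and `s1` can sit on the same tree, whereas `s12` and `s13` conflict, so any collection containing both fails the check.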

16.

Computing phylogenetic diversity for split systems.

Spillner, Andreas; Nguyen, Binh T; Moulton, Vincent.

IEEE/ACM Trans Comput Biol Bioinform ; 5(2): 235-44, 2008.

Article in English | MEDLINE | ID: mdl-18451432

ABSTRACT

In conservation biology it is a central problem to measure, predict, and preserve biodiversity as species face extinction. In 1992 Faith proposed measuring the diversity of a collection of species in terms of their relationships on a phylogenetic tree, and to use this information to identify collections of species with high diversity. Here we are interested in some variants of the resulting optimization problem that arise when considering species whose evolution is better represented by a network rather than a tree. More specifically, we consider the problem of computing phylogenetic diversity relative to a split system on a collection of species of size n. We show that for general split systems this problem is NP-hard. In addition we provide some efficient algorithms for some special classes of split systems, in particular presenting an optimal O(n) time algorithm for phylogenetic trees and an O(n log n + nk) time algorithm for choosing an optimal subset of size k relative to a circular split system.

Subjects

Biodiversity , Genetic Variation , Genetic Models , Phylogeny , Algorithms , Computational Biology , Computer Simulation , Mathematics
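To make the optimization problem in the abstract above concrete, the following Python sketch evaluates phylogenetic diversity relative to a weighted split system. It assumes the definition in which the diversity of a subset Y is the total weight of the splits that separate Y, meaning both sides of the split contain a member of Y. The exhaustive search in `best_subset` is exponential and only suits tiny examples. It is not the O(n) or O(n log n + nk) algorithm from the paper, and all names and weights are hypothetical.

```python
from itertools import combinations


def phylogenetic_diversity(weighted_splits, subset):
    """Sum the weights of all splits that have a member of `subset`
    on both sides. `weighted_splits` is a list of ((A, B), weight)."""
    y = set(subset)
    return sum(w for (a, b), w in weighted_splits if a & y and b & y)


def best_subset(weighted_splits, taxa, k):
    """Brute-force search over all subsets of size k (illustration only)."""
    return max(
        combinations(sorted(taxa), k),
        key=lambda s: phylogenetic_diversity(weighted_splits, s),
    )


# Hypothetical split system on four taxa: four trivial splits and ab|cd.
taxa = {"a", "b", "c", "d"}
splits = [
    (({"a"}, {"b", "c", "d"}), 1.0),
    (({"b"}, {"a", "c", "d"}), 1.0),
    (({"c"}, {"a", "b", "d"}), 2.0),
    (({"d"}, {"a", "b", "c"}), 3.0),
    (({"a", "b"}, {"c", "d"}), 1.5),
]
best_pair = best_subset(splits, taxa, 2)
```

In this toy system the pair {c, d} scores 5.0 because the split ab|cd does not separate it, while {a, d} scores 5.5.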

17.

Consistency of the neighbor-net algorithm.

Bryant, David; Moulton, Vincent; Spillner, Andreas.

Algorithms Mol Biol ; 2: 8, 2007 Jun 28.

Article in English | MEDLINE | ID: mdl-17597551

ABSTRACT

BACKGROUND: Neighbor-Net is a novel method for phylogenetic analysis that is currently being widely used in areas such as virology, bacteriology, and plant evolution. Given an input distance matrix, Neighbor-Net produces a phylogenetic network, a generalization of an evolutionary or phylogenetic tree which allows the graphical representation of conflicting phylogenetic signals. RESULTS: In general, any network construction method should not depict more conflict than is found in the data, and, when the data is fitted well by a tree, the method should return a network that is close to this tree. In this paper we provide a formal proof that Neighbor-Net satisfies both of these requirements so that, in particular, Neighbor-Net is statistically consistent on circular distances.
