ABSTRACT
Tree shape statistics based on peripheral structures have been utilized to study evolutionary mechanisms and inference methods. Partially motivated by a recent study by Pouryahya and Sankoff on modeling the accumulation of subgenomes in the evolution of polyploids, we present the distribution of subtree patterns with four or fewer leaves for the unrooted Proportional to Distinguishable Arrangements (PDA) model. We derive a recursive formula for computing the joint distributions, as well as a Strong Law of Large Numbers and a Central Limit Theorem for the joint distributions. This enables us to confirm several conjectures proposed by Pouryahya and Sankoff, as well as provide some theoretical insights into their observations. Based on their empirical datasets, we demonstrate that the statistical test based on the joint distribution could be more sensitive than those based on one individual subtree pattern to detect the existence of evolutionary forces such as whole genome duplication.
Subjects
Algorithms, Genetic Models, Phylogeny
ABSTRACT
Calcium signalling is central to many plant processes, with families of calcium decoder proteins having expanded across the green lineage and redundancy existing between decoders. The liverwort Marchantia polymorpha has fast become a new model plant, but the calcium decoders that exist in this species remain unclear. We performed phylogenetic analyses to identify the calcineurin B-like (CBL) and CBL-interacting protein kinase (CIPK) network of M. polymorpha. We analysed CBL-CIPK expression during salt stress, and determined protein-protein interactions using yeast two-hybrid and bimolecular fluorescence complementation. We also created genetic knockouts using CRISPR/Cas9. We confirm that M. polymorpha has two CIPKs and three CBLs. Both CIPKs and one CBL show pronounced salt-responsive transcriptional changes. All M. polymorpha CBL-CIPKs interact with each other in planta. Knocking out CIPK-B causes increased sensitivity to salt, suggesting that this CIPK is involved in salt signalling. We have identified CBL-CIPKs that form part of a salt tolerance pathway in M. polymorpha. Phylogeny and interaction studies imply that these CBL-CIPKs form an evolutionarily conserved salt overly sensitive pathway. Hence, salt responses may be some of the early functions of CBL-CIPK networks and increased abiotic stress tolerance required for land plant emergence.
Subjects
Marchantia, Marchantia/genetics, Marchantia/metabolism, Plant Proteins/genetics, Plant Proteins/metabolism, Phylogeny, Calcium/metabolism, Salt Tolerance/genetics, Physiological Stress/genetics, Calcium-Binding Proteins/metabolism
ABSTRACT
Distributional properties of tree shape statistics under random phylogenetic tree models play an important role in investigating the evolutionary forces underlying the observed phylogenies. In this paper, we study two subtree counting statistics, the number of cherries and that of pitchforks for the Ford model, the alpha model introduced by Daniel Ford. It is a one-parameter family of random phylogenetic tree models which includes the proportional to distinguishable arrangement (PDA) and the Yule models, two tree models commonly used in phylogenetics. Based on a non-uniform version of the extended Pólya urn models in which negative entries are permitted for their replacement matrices, we obtain the strong law of large numbers and the central limit theorem for the joint distribution of these two statistics for the Ford model. Furthermore, we derive a recursive formula for computing the exact joint distribution of these two statistics. This leads to exact formulas for their means and higher order asymptotic expansions of their second moments, which allows us to identify a critical parameter value for the correlation between these two statistics. That is, when the number of tree leaves is sufficiently large, they are negatively correlated for 0≤α≤1/2 and positively correlated for 1/2<α<1.
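As a rough illustration of what a recursive computation of this kind can look like, here is a minimal sketch for the marginal distribution of the number of cherries only. It is not the paper's formula. It assumes rooted trees and the usual edge-weight description of Ford's alpha model: a tree with m leaves grows by attaching a new leaf to an edge chosen with weight 1 − α for each of the m leaf edges and α for each of the m − 1 other edges, root edge included. Under that assumption the cherry count rises by one exactly when the new leaf lands on a leaf edge outside every cherry. The names `cherry_distribution` and `mean` are hypothetical.

```python
from fractions import Fraction


def cherry_distribution(n, alpha):
    """Exact cherry-count distribution for a rooted tree with n >= 2 leaves.

    Assumes the edge-weight form of the alpha model described in the lead-in.
    """
    alpha = Fraction(alpha)
    dist = {1: Fraction(1)}  # a 2-leaf tree has exactly one cherry
    for m in range(2, n):  # grow from m leaves to m + 1 leaves
        total = m - alpha  # m * (1 - alpha) + (m - 1) * alpha
        nxt = {}
        for c, p in dist.items():
            # probability that the new leaf lands on a leaf edge outside any cherry
            gain = (m - 2 * c) * (1 - alpha) / total
            if gain:
                nxt[c + 1] = nxt.get(c + 1, 0) + p * gain
            if gain != 1:
                nxt[c] = nxt.get(c, 0) + p * (1 - gain)
        dist = nxt
    return dist


def mean(dist):
    """Expected value of a distribution stored as {value: probability}."""
    return sum(c * p for c, p in dist.items())
```

Setting `alpha=0` gives the Yule case and `alpha=Fraction(1, 2)` the PDA case under this description. Exact rational arithmetic via `Fraction` avoids rounding error, at the cost of speed for large n.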
Subjects
Biological Evolution, Genetic Models, Phylogeny
ABSTRACT
The Southern Ocean houses a diverse and productive community of organisms. Unicellular eukaryotic diatoms are the main primary producers in this environment, where photosynthesis is limited by low concentrations of dissolved iron and large seasonal fluctuations in light, temperature and the extent of sea ice. How diatoms have adapted to this extreme environment is largely unknown. Here we present insights into the genome evolution of a cold-adapted diatom from the Southern Ocean, Fragilariopsis cylindrus, based on a comparison with temperate diatoms. We find that approximately 24.7 per cent of the diploid F. cylindrus genome consists of genetic loci with alleles that are highly divergent (15.1 megabases of the total genome size of 61.1 megabases). These divergent alleles were differentially expressed across environmental conditions, including darkness, low iron, freezing, elevated temperature and increased CO2. Alleles with the largest ratio of non-synonymous to synonymous nucleotide substitutions also show the most pronounced condition-dependent expression, suggesting a correlation between diversifying selection and allelic differentiation. Divergent alleles may be involved in adaptation to environmental fluctuations in the Southern Ocean.
Subjects
Acclimatization/genetics, Cold Temperature, Diatoms/genetics, Molecular Evolution, Genome/genetics, Genomics, Alleles, Carbon Dioxide/metabolism, Darkness, Diatoms/metabolism, Freezing, Gene Expression Profiling, Genetic Drift, Ice Cover, Iron/metabolism, Mutation Rate, Oceans and Seas, Phylogeny, Genetic Recombination, Transcriptome/genetics
ABSTRACT
Tree shape statistics provide valuable quantitative insights into evolutionary mechanisms underpinning phylogenetic trees, a commonly used graph representation of evolutionary relationships among taxonomic units ranging from viruses to species. We study two subtree counting statistics, the number of cherries and the number of pitchforks, for random phylogenetic trees generated by two widely used null tree models: the proportional to distinguishable arrangements (PDA) and the Yule-Harding-Kingman (YHK) models. By developing limit theorems for a version of extended Pólya urn models in which negative entries are permitted for their replacement matrices, we deduce the strong laws of large numbers and the central limit theorems for the joint distributions of these two counting statistics for the PDA and the YHK models. Our results indicate that the limiting behaviour of these two statistics, when appropriately scaled using the number of leaves in the underlying trees, is independent of the initial tree used in the tree generating process.
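To make the urn connection concrete, the following is a small simulation sketch for the cherry count alone under the YHK model. It is an illustration under stated assumptions, not the authors' construction. The urn holds two kinds of balls: leaves that sit in a cherry and leaves that do not. Splitting a cherry leaf changes the counts by (0, +1), and splitting any other leaf changes them by (+2, −1), which is where the negative entry of the replacement matrix comes from. The function name `simulate_yhk_cherries` and the starting tree (a single cherry) are assumptions made for the example.

```python
import random


def simulate_yhk_cherries(n, seed=0):
    """Simulate the number of cherries in a rooted YHK tree with n >= 2 leaves."""
    rng = random.Random(seed)
    in_cherry, free = 2, 0  # start from the 2-leaf tree, which is a single cherry
    for _ in range(n - 2):
        # pick a leaf uniformly at random and split it into two leaves
        if rng.random() * (in_cherry + free) < in_cherry:
            # a cherry leaf was split: its old sibling becomes a free leaf
            free += 1
        else:
            # a free leaf was split: it becomes a new cherry
            in_cherry += 2
            free -= 1
    return in_cherry // 2
```

For large n the ratio `simulate_yhk_cherries(n) / n` should settle near 1/3, the known limiting cherry proportion for this model. A different starting tree could be modelled by changing the initial values of `in_cherry` and `free`.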
Subjects
Biological Evolution, Plant Leaves, Genetic Models, Phylogeny
ABSTRACT
Tree shape statistics are important for investigating evolutionary mechanisms mediating phylogenetic trees. As a step towards bridging shape statistics between rooted and unrooted trees, we present a comparison study on two subtree statistics known as numbers of cherries and pitchforks for the proportional to distinguishable arrangements (PDA) and the Yule-Harding-Kingman (YHK) models. Based on recursive formulas on the joint distribution of the number of cherries and that of pitchforks, it is shown that cherry distributions are log-concave for both rooted and unrooted trees under these two models. Furthermore, the mean number of cherries and that of pitchforks for unrooted trees converge respectively to those for rooted trees under the YHK model while there exists a limiting gap of 1∕4 for the PDA model. Finally, the total variation distances between the cherry distributions of rooted and those of unrooted trees converge for both models. Our results indicate that caution is required for conducting statistical analysis for tree shapes involving both rooted and unrooted trees.
Subjects
Biological Evolution, Genetic Models, Algorithms, Phylogeny
ABSTRACT
Rearrangements are discrete processes whereby discrete segments of DNA are deleted, replicated and inserted into novel positions. A sequence of such configurations, termed a rearrangement evolution, results in jumbled DNA arrangements, frequently observed in cancer genomes. We introduce a method that allows us to precisely count these different evolutions for a range of processes including breakage-fusion-bridge-cycles, tandem-duplications, inverted-duplications, reversals, transpositions and deletions, showing that the space of rearrangement evolution is super-exponential in size. These counts assume the infinite sites model of unique breakpoint usage.
Subjects
DNA, Genome, Gene Rearrangement/genetics, Genome/genetics
ABSTRACT
To reduce pollutant emissions while also easing the pressure of petroleum shortages and greenhouse gas emissions, the use of clean and renewable alternative fuels for marine engines is a promising option. In this study, a marine diesel engine, which was modified to run in diesel methanol compound combustion (DMCC) mode, was investigated. After the diesel injection parameters were calibrated, and combined with a sample after-treatment device DOC (diesel oxidation catalyst), the engine could meet the requirements of China II legislation. The overall MSP (methanol substitute percent) reached 54.1%. The value of each pollutant emission was much lower than the limit in the China II emission legislation, and there were almost no methanol or formaldehyde emissions. When methanol was injected into the inlet manifold, the intake air temperature decreased substantially, as did the exhaust gas temperature, which helped increase engine thermal efficiency and improve the engine room environment. Compared with the engine running in pure diesel mode, when the engine ran in diesel/methanol dual fuel mode, the combustion phase was advanced, and the combustion duration became shorter. Therefore, the engine thermal efficiency increased, and fuel consumption decreased significantly.
Subjects
Greenhouse Gases, Methanol, Biofuels/analysis, China, Gasoline/analysis, Vehicle Emissions/analysis
ABSTRACT
Summary: Split-networks are a generalization of phylogenetic trees that have proven to be a powerful tool in phylogenetics. Various ways have been developed for computing such networks, including split-decomposition, NeighborNet, QNet and FlatNJ. Some of these approaches are implemented in the user-friendly SplitsTree software package. However, to give the user the option to adjust and extend these approaches and to facilitate their integration into analysis pipelines, there is a need for robust, open-source implementations of associated data structures and algorithms. Here, we present SPECTRE, a readily available, open-source library of data structures written in Java, that comes complete with new implementations of several pre-published algorithms and a basic interactive graphical interface for visualizing planar split networks. SPECTRE also supports the use of longer running algorithms by providing command line interfaces, which can be executed on servers or in High Performance Computing environments. Availability and implementation: Full source code is available under the GPLv3 license at: https://github.com/maplesond/SPECTRE. SPECTRE's core library is available from Maven Central at: https://mvnrepository.com/artifact/uk.ac.uea.cmp.spectre/core. Documentation is available at: http://spectre-suite-of-phylogenetic-tools-for-reticulate-evolution.readthedocs.io/en/latest/. Contact: sarah.bastkowski@earlham.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Phylogeny, Algorithms, Gene Library, Software
ABSTRACT
An important problem in phylogenetics is the construction of phylogenetic trees. One way to approach this problem, known as the supertree method, involves inferring a phylogenetic tree with leaves consisting of a set X of species from a collection of trees, each having leaf-set some subset of X. In the 1980s, Colonius and Schulze gave certain inference rules for deciding when a collection of 4-leaved trees, one for each 4-element subset of X, can be simultaneously displayed by a single supertree with leaf-set X. Recently, it has become of interest to extend this and related results to phylogenetic networks. These are a generalization of phylogenetic trees which can be used to represent reticulate evolution (where species can come together to form a new species). It has recently been shown that a certain type of phylogenetic network, called a (unrooted) level-1 network, can essentially be constructed from 4-leaved trees. However, the problem of providing appropriate inference rules for such networks remains unresolved. Here, we show that by considering 4-leaved networks, called quarnets, as opposed to 4-leaved trees, it is possible to provide such rules. In particular, we show that these rules can be used to characterize when a collection of quarnets, one for each 4-element subset of X, can all be simultaneously displayed by a level-1 network with leaf-set X. The rules are an intriguing mixture of tree inference rules, and an inference rule for building up a cyclic ordering of X from orderings on subsets of X of size 4. This opens up several new directions of research for inferring phylogenetic networks from smaller ones, which could yield new algorithms for solving the supernetwork problem in phylogenetics.
Subjects
Biological Models, Phylogeny, Biological Evolution, Genetic Speciation, Mathematical Concepts
ABSTRACT
Phylogenetic networks are a generalization of phylogenetic trees that allow for representation of reticulate evolution. Recently, a space of unrooted phylogenetic networks was introduced, where such a network is a connected graph in which every vertex has degree 1 or 3 and whose leaf-set is a fixed set X of taxa. This space, denoted [Formula: see text], is defined in terms of two operations on networks-the nearest neighbor interchange and triangle operations-which can be used to transform any network with leaf set X into any other network with that leaf set. In particular, it gives rise to a metric d on [Formula: see text] which is given by the smallest number of operations required to transform one network in [Formula: see text] into another in [Formula: see text]. The metric generalizes the well-known NNI-metric on phylogenetic trees which has been intensively studied in the literature. In this paper, we derive a bound for the metric d as well as a related metric [Formula: see text] which arises when restricting d to the subset of [Formula: see text] consisting of all networks with [Formula: see text] vertices, [Formula: see text]. We also introduce two new metrics on networks-the SPR and TBR metrics-which generalize the metrics on phylogenetic trees with the same name and give bounds for these new metrics. We expect our results to eventually have applications to the development and understanding of network search algorithms.
Subjects
Biological Evolution, Biological Models, Phylogeny, Algorithms, Mathematical Concepts
ABSTRACT
Phylogenetic networks are a generalization of evolutionary trees that can be used to represent reticulate processes such as hybridization and recombination. Here, we introduce a new approach called TriLoNet (Trinet Level-one Network algorithm) to construct such networks directly from sequence alignments which works by piecing together smaller phylogenetic networks. More specifically, using a bottom-up approach similar to Neighbor-Joining, TriLoNet constructs level-1 networks (networks that are somewhat more general than trees) from smaller level-1 networks on three taxa. In simulations, we show that TriLoNet compares well with Lev1athan, a method for reconstructing level-1 networks from three-leaved trees. In particular, in simulations we find that Lev1athan tends to generate networks that overestimate the number of reticulate events as compared with those generated by TriLoNet. We also illustrate TriLoNet's applicability using simulated and real sequence data involving recombination, demonstrating that it has the potential to reconstruct informative reticulate evolutionary histories. TriLoNet has been implemented in JAVA and is freely available at https://www.uea.ac.uk/computing/TriLoNet.
Subjects
Biological Evolution, Sequence Alignment/methods, Algorithms, Computer Simulation, Molecular Evolution, Gene Regulatory Networks/genetics, Genetic Models, Phylogeny
ABSTRACT
MOTIVATION: Distance methods are well suited for constructing massive phylogenetic trees. However, the computational complexity for Rzhetsky and Nei's minimum evolution (ME) approach, one of the earliest methods for constructing a phylogenetic tree from a distance matrix, remains open. RESULTS: We show that Rzhetsky and Nei's ME problem is NP-complete, and so probably computationally intractable. We do this by linking the ME problem to a graph clustering problem called the quasi-clique decomposition problem, which has recently also been shown to be NP-complete. We also discuss how this link could potentially open up some useful new connections between phylogenetics and graph clustering.
Subjects
Algorithms, Biological Evolution, Theoretical Models, Phylogeny, Cluster Analysis, Computer Simulation, Humans, Software
ABSTRACT
Phylogenetic networks are a generalization of evolutionary trees that are used by biologists to represent the evolution of organisms which have undergone reticulate evolution. Essentially, a phylogenetic network is a directed acyclic graph having a unique root in which the leaves are labelled by a given set of species. Recently, some approaches have been developed to construct phylogenetic networks from collections of networks on 2- and 3-leaved networks, which are known as binets and trinets, respectively. Here we study in more depth properties of collections of binets, one of the simplest possible types of networks into which a phylogenetic network can be decomposed. More specifically, we show that if a collection of level-1 binets is compatible with some binary network, then it is also compatible with a binary level-1 network. Our proofs are based on useful structural results concerning lowest stable ancestors in networks. In addition, we show that, although the binets do not determine the topology of the network, they do determine the number of reticulations in the network, which is one of its most important parameters. We also consider algorithmic questions concerning binets. We show that deciding whether an arbitrary set of binets is compatible with some network is at least as hard as the well-known graph isomorphism problem. However, if we restrict to level-1 binets, it is possible to decide in polynomial time whether there exists a binary network that displays all the binets. We also show that to find a network that displays a maximum number of the binets is NP-hard, but that there exists a simple polynomial-time 1/3-approximation algorithm for this problem. It is hoped that these results will eventually assist in the development of new methods for constructing phylogenetic networks from collections of smaller networks.
Subjects
Biological Models, Phylogeny, Algorithms, Biological Evolution, Mathematical Concepts
ABSTRACT
In population and evolutionary biology, hypotheses about micro-evolutionary and macro-evolutionary processes are commonly tested by comparing the shape indices of empirical evolutionary trees with those predicted by neutral models. A key ingredient in this approach is the ability to compute and quantify distributions of various tree shape indices under random models of interest. As a step to meet this challenge, in this paper we investigate the joint distribution of cherries and pitchforks (that is, subtrees with two and three leaves) under two widely used null models: the Yule-Harding-Kingman (YHK) model and the proportional to distinguishable arrangements (PDA) model. Based on two novel recursive formulae, we propose a dynamic approach to numerically compute the exact joint distribution (and hence the marginal distributions) for trees of any size. We also obtained insights into the statistical properties of trees generated under these two models, including a constant correlation between the cherry and the pitchfork distributions under the YHK model, and the log-concavity and unimodality of the cherry distributions under both models. In addition, we show that there exists a unique change point for the cherry distributions between these two models.
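A dynamic computation of this sort can be sketched as follows for the YHK model on rooted trees. This is an illustrative reconstruction from the leaf-splitting description of the model, not necessarily the recursion given in the paper. A tree with m leaves, a cherries and b pitchforks has 2b leaves inside a pitchfork's cherry, b leaves that are the third leaf of a pitchfork, 2(a − b) leaves in cherries outside any pitchfork, and m − 2a − b remaining leaves. Splitting a uniformly chosen leaf moves the pair (a, b) according to which group the leaf belongs to. The function name `yhk_joint` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def yhk_joint(n):
    """Exact joint distribution of (cherries, pitchforks) for n >= 3 leaves."""
    # the only 3-leaf shape has one cherry, which sits inside one pitchfork
    dist = {(1, 1): Fraction(1)}
    for m in range(3, n):  # grow from m leaves to m + 1 leaves
        nxt = defaultdict(Fraction)
        for (a, b), p in dist.items():
            # new state -> number of leaves whose split leads to it
            moves = {
                (a, b): 2 * b,              # leaf in a pitchfork's cherry
                (a + 1, b - 1): b,          # third leaf of a pitchfork
                (a, b + 1): 2 * (a - b),    # leaf in a cherry outside any pitchfork
                (a + 1, b): m - 2 * a - b,  # any other leaf
            }
            for state, count in moves.items():
                if count:
                    nxt[state] += p * Fraction(count, m)
        dist = dict(nxt)
    return dist
```

Marginal distributions follow by summing over the other coordinate, for example `sum(p for (a, b), p in yhk_joint(10).items() if a == 3)` for the probability of exactly three cherries. The table holds at most on the order of n² states per step, so the whole computation stays polynomial in n.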
Subjects
Biological Evolution, Biological Models, Phylogeny, Humans
ABSTRACT
Phylogenetic networks are a generalization of evolutionary trees and are an important tool for analyzing reticulate evolutionary histories. Recently, there has been great interest in developing new methods to construct rooted phylogenetic networks, that is, networks whose internal vertices correspond to hypothetical ancestors, whose leaves correspond to sampled taxa, and in which vertices with more than one parent correspond to taxa formed by reticulate evolutionary events such as recombination or hybridization. Several methods for constructing evolutionary trees use the strategy of building up a tree from simpler building blocks (such as triplets or clusters), and so it is natural to look for ways to construct networks from smaller networks. In this article, we shall demonstrate a fundamental issue with this approach. Namely, we show that even if we are given all of the subnetworks induced on all proper subsets of the leaves of some rooted phylogenetic network, we still do not have all of the information required to completely determine that network. This implies that even if all of the building blocks for some reticulate evolutionary history were to be taken as the input for any given network building method, the method might still output an incorrect history. We also discuss some potential consequences of this result for constructing phylogenetic networks.
Subjects
Classification/methods, Phylogeny, Cryptococcus gattii/classification, Cryptococcus gattii/genetics, Theoretical Models
ABSTRACT
Phylogenetic networks are a generalization of phylogenetic trees that are used to represent reticulate evolution. Unrooted phylogenetic networks form a special class of such networks, which naturally generalize unrooted phylogenetic trees. In this paper we define two operations on unrooted phylogenetic networks, one of which is a generalization of the well-known nearest-neighbor interchange (NNI) operation on phylogenetic trees. We show that any unrooted phylogenetic network can be transformed into any other such network using only these operations. This generalizes the well-known fact that any phylogenetic tree can be transformed into any other such tree using only NNI operations. It also allows us to define a generalization of tree space and to define some new metrics on unrooted phylogenetic networks. To prove our main results, we employ some fascinating new connections between phylogenetic networks and cubic graphs that we have recently discovered. Our results should be useful in developing new strategies to search for optimal phylogenetic networks, a topic that has recently generated some interest in the literature, as well as for providing new ways to compare networks.
Subjects
Phylogeny, Biological Models
ABSTRACT
Phylogenetic networks are rooted, labelled directed acyclic graphs which are commonly used to represent reticulate evolution. There is a close relationship between phylogenetic networks and multi-labelled trees (MUL-trees). Indeed, any phylogenetic network N can be "unfolded" to obtain a MUL-tree U(N) and, conversely, a MUL-tree T can in certain circumstances be "folded" to obtain a phylogenetic network F(T) that exhibits T. In this paper, we study properties of the operations U and F in more detail. In particular, we introduce the class of stable networks, phylogenetic networks N for which F(U(N)) is isomorphic to N, characterise such networks, and show that they are related to the well-known class of tree-sibling networks. We also explore how the concept of displaying a tree in a network N can be related to displaying the tree in the MUL-tree U(N). To do this, we develop a phylogenetic analogue of graph fibrations. This allows us to view U(N) as the analogue of the universal cover of a digraph, and to establish a close connection between displaying trees in U(N) and reconciling phylogenetic trees with networks.
Subjects
Classification/methods, Genetic Models, Phylogeny, Algorithms
ABSTRACT
Phylogenetic networks are a generalization of evolutionary or phylogenetic trees that are used to represent the evolution of species which have undergone reticulate evolution. In this paper we consider spaces of such networks defined by some novel local operations that we introduce for converting one phylogenetic network into another. These operations are modeled on the well-studied nearest-neighbor interchange operations on phylogenetic trees, and lead to natural generalizations of the tree spaces that have been previously associated to such operations. We present several results on spaces of some relatively simple networks, called level-1 networks, including the size of the neighborhood of a fixed network, and bounds on the diameter of the metric defined by taking the smallest number of operations required to convert one network into another. We expect that our results will be useful in the development of methods for systematically searching for optimal phylogenetic networks using, for example, likelihood and Bayesian approaches.
Subjects
Biological Evolution, Biological Models, Phylogeny, Algorithms, Bayes Theorem, Computational Biology, Likelihood Functions, Mathematical Concepts
ABSTRACT
In phylogenetics, a common strategy used to construct an evolutionary tree for a set of species [Formula: see text] is to search in the space of all such trees for one that optimizes some given score function (such as the minimum evolution, parsimony or likelihood score). As this can be computationally intensive, it was recently proposed to restrict such searches to the set of all those trees that are compatible with some circular ordering of the set [Formula: see text]. To inform the design of efficient algorithms to perform such searches, it is therefore of interest to find bounds for the number of trees compatible with a fixed ordering in the neighborhood of a tree that is determined by certain tree operations commonly used to search for trees: the nearest neighbor interchange (NNI), the subtree prune and regraft (SPR) and the tree bisection and reconnection (TBR) operations. We show that the size of such a neighborhood of a binary tree associated with the NNI operation is independent of the tree's topology, but that this is not the case for the SPR and TBR operations. We also give tight upper and lower bounds for the size of the neighborhood of a binary tree for the SPR and TBR operations and characterize those trees for which these bounds are attained.