1.
Syst Biol ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38912803

ABSTRACT

The role of interspecific hybridization has recently seen increasing attention, especially in the context of diversification dynamics. Genomic research has now made it abundantly clear that both hybridization and introgression - the exchange of genetic material through hybridization and backcrossing - are far more common than previously thought. Besides cases of ongoing or recent genetic exchange between taxa, an increasing number of studies report "ancient introgression" - referring to results of hybridization that took place in the distant past. However, it is not clear whether commonly used methods for the detection of introgression are applicable to such old systems, given that most of these methods were originally developed for analyses at the level of populations and recently diverged species, affected by recent or ongoing genetic exchange. In particular, the assumption of constant evolutionary rates, which is implicit in many commonly used approaches, is more likely to be violated as evolutionary divergence increases. To test the limitations of introgression detection methods when being applied to old systems, we simulated thousands of genomic datasets under a wide range of settings, with varying degrees of among-species rate variation and introgression. Using these simulated datasets, we showed that some commonly applied statistical methods, including the D-statistic and certain tests based on sets of local phylogenetic trees, can produce false-positive signals of introgression between divergent taxa that have different rates of evolution. These misleading signals are caused by the presence of homoplasies occurring at different rates in different lineages. To distinguish between the patterns caused by rate variation and genuine introgression, we developed a new test that is based on the expected clustering of introgressed sites along the genome, and implemented this test in the program Dsuite.
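
For orientation, the D-statistic named in this abstract is commonly computed from counts of ABBA and BABA site patterns on a four-taxon tree (((P1,P2),P3),O). The following is a minimal sketch under assumptions not stated in the abstract: sites are biallelic and coded as four-character strings of `A` (ancestral) and `B` (derived) in the order P1, P2, P3, outgroup. The function name and input format are illustrative and are not taken from Dsuite.

```python
def d_statistic(patterns):
    """Patterson's D from per-site patterns for taxa (P1, P2, P3, O).

    Each pattern is a string such as "ABBA" (hypothetical input format).
    Only ABBA and BABA sites contribute; all other patterns are ignored.
    """
    abba = sum(1 for p in patterns if p == "ABBA")
    baba = sum(1 for p in patterns if p == "BABA")
    if abba + baba == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / (abba + baba)


# Toy input: three ABBA sites, one BABA site, one uninformative site.
sites = ["ABBA", "BABA", "ABBA", "BBAA", "ABBA"]
d = d_statistic(sites)
```

Without gene flow, ABBA and BABA sites are expected to be equally frequent, so D should be close to zero. The abstract's point is that lineage-specific rate variation combined with homoplasy can shift this balance even when no introgression has occurred.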

2.
Proc Natl Acad Sci U S A ; 119(38): e2210604119, 2022 09 20.
Article in English | MEDLINE | ID: mdl-36103580

ABSTRACT

Inferring the transmission direction between linked individuals living with HIV provides unparalleled power to understand the epidemiology that determines transmission. Phylogenetic ancestral-state reconstruction approaches infer the transmission direction by identifying the individual in whom the most recent common ancestor of the virus populations originated. While these methods vary in accuracy, it is unclear why. To evaluate the performance of phylogenetic ancestral-state reconstruction to determine the transmission direction of HIV-1 infection, we inferred the transmission direction for 112 transmission pairs where transmission direction and detailed additional information were available. We then fit a statistical model to evaluate the extent to which epidemiological, sampling, genetic, and phylogenetic factors influenced the outcome of the inference. Finally, we repeated the analysis under real-life conditions with only routinely available data. We found that whether ancestral-state reconstruction correctly infers the transmission direction depends principally on the phylogeny's topology. For example, under real-life conditions, the probability of identifying the correct transmission direction increases from 32% (when a monophyletic-monophyletic or paraphyletic-polyphyletic tree topology is observed and the tip closest to the root does not agree with the state at the root) to 93% (when a paraphyletic-monophyletic topology is observed and the tip closest to the root agrees with the root state). Our results suggest that documenting larger differences in relative intrahost diversity increases our confidence in the transmission direction inference of linked pairs for population-level studies of HIV. These findings provide a practical starting point to determine our confidence in transmission direction inference from ancestral-state reconstruction.


Subjects
HIV Infections , HIV-1 , Sexual Partners , Female , HIV Infections/transmission , HIV Infections/virology , Humans , Male , Statistical Models , Phylogeny , Sexual Partners/classification
3.
Mol Phylogenet Evol ; 183: 107776, 2023 06.
Article in English | MEDLINE | ID: mdl-36990305

ABSTRACT

Tree shape metrics can be computed fast for trees of any size, which makes them promising alternatives to intensive statistical methods and parameter-rich evolutionary models in the era of massive data availability. Previous studies have demonstrated their effectiveness in unveiling important parameters in viral evolutionary dynamics, although the impact of natural selection on the shape of tree topologies has not been thoroughly investigated. We carried out a forward-time and individual-based simulation to investigate whether tree shape metrics of several kinds could predict the selection regime employed to generate the data. To examine the impact of the genetic diversity of the founder viral population, simulations were run under two opposing starting configurations of the genetic diversity of the infecting viral population. We found that four evolutionary regimes, namely, negative, positive, and frequency-dependent selection, as well as neutral evolution, were successfully distinguished by tree topology shape metrics. Two metrics from the Laplacian spectral density profile (principal eigenvalue and peakedness) and the number of cherries were the most informative for indicating selection type. The genetic diversity of the founder population had an impact on differentiating evolutionary scenarios. Tree imbalance, which has been frequently associated with the action of natural selection on intrahost viral diversity, was also characteristic of neutrally evolving serially sampled data. Metrics calculated from empirical analysis of HIV datasets indicated that most tree topologies exhibited shapes closer to the frequency-dependent selection or neutral evolution regimes.
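
As an illustration of how cheaply such shape metrics can be computed, the number of cherries (internal nodes whose two children are both leaves) can be obtained in a single traversal. This is a minimal sketch, assuming a rooted binary tree written as nested Python tuples with strings as leaves; that representation and the function name are choices made here, not something described in the abstract.

```python
def count_cherries(tree):
    """Count internal nodes whose two children are both leaves.

    A tree is either a leaf (any non-tuple value) or a pair (left, right).
    """
    if not isinstance(tree, tuple):
        return 0  # a single leaf contains no cherry
    left, right = tree
    if not isinstance(left, tuple) and not isinstance(right, tuple):
        return 1  # this node is itself a cherry
    return count_cherries(left) + count_cherries(right)


balanced = (("a", "b"), ("c", "d"))       # two cherries
caterpillar = ((("a", "b"), "c"), "d")    # one cherry
n_balanced = count_cherries(balanced)
n_caterpillar = count_cherries(caterpillar)
```

For a fixed number of tips, a fully balanced tree has the most cherries and a caterpillar (ladder-like) tree has exactly one, which is why the count carries information about tree shape.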


Subjects
Biological Evolution , Trees , Phylogeny , Computer Simulation , Genetic Selection , Genetic Models
4.
Sensors (Basel) ; 23(13)2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37447799

ABSTRACT

Wireless sensor networks (WSNs) have been utilized as communication infrastructure for smart grid applications. The primary requirement of WSNs for smart grid applications is to transmit delay-critical data from smart grid assets either at the maximum rate or with reduced collision rates. Additionally, WSNs should utilize the limited resources of the network to provide the required long-term QoS. Achieving these objectives requires careful design of WSN protocols to satisfy the requirements of smart grid applications. In this study, a multi-channel cluster tree protocol is proposed to prevent collisions and increase network performance. In the proposed scheme, the cluster head broadcasts a beacon frame containing information on the allocated channels and time slots. This enables a new node to determine its channel and time slot. A performance analysis reveals that the proposed scheme achieves a lower end-to-end delay and lower collision rates than the well-known IEEE 802.15.4 MAC protocols widely used in the literature to provide QoS to smart grid applications.


Subjects
Computer Communication Networks , Wireless Technology
5.
Mol Phylogenet Evol ; 164: 107267, 2021 11.
Article in English | MEDLINE | ID: mdl-34293395

ABSTRACT

Tetrapod taxa with broad geographic distributions across the Neotropics are often composed of multiple evolutionary lineages. In this paper, we present the most complete phylogeny of Leptophis to date and assess morphology-based species limits within the broadly distributed green parrot snake Leptophis ahaetulla sensu lato, which occurs from Mexico to Argentina. Although L. ahaetulla sensu stricto, L. nigromarginatus and L. occidentalis were recovered as paraphyletic, tree topology tests failed to reject their monophyly. Monophyly of L. bocourti, L. coeruleodorsus, L. cupreus, L. depressirostris, L. marginatus, L. riveti and L. sp. nov. was strongly supported. Our phylogenetic trees support recognition of multiple species within Leptophis ahaetulla sensu lato and suggest that color evolution and the uplift of the Andes played an important role in the diversification of parrot snakes.


Subjects
Colubridae , Parrots , Animals , Argentina , Colubridae/genetics , Mexico , Phylogeny , Snakes/genetics
6.
Syst Biol ; 69(2): 280-293, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31504997

ABSTRACT

Bayesian Markov chain Monte Carlo explores tree space slowly, in part because it frequently returns to the same tree topology. An alternative strategy would be to explore tree space systematically, and never return to the same topology. In this article, we present an efficient parallelized method to map out the high likelihood set of phylogenetic tree topologies via systematic search, which we show to be a good approximation of the high posterior set of tree topologies on the data sets analyzed. Here, "likelihood" of a topology refers to the tree likelihood for the corresponding tree with optimized branch lengths. We call this method "phylogenetic topographer" (PT). The PT strategy is very simple: starting in a number of local topology maxima (obtained by hill-climbing from random starting points), explore out using local topology rearrangements, only continuing through topologies that are better than some likelihood threshold below the best observed topology. We show that the normalized topology likelihoods are a useful proxy for the Bayesian posterior probability of those topologies. By using a nonblocking hash table keyed on unique representations of tree topologies, we avoid visiting topologies more than once across all concurrent threads exploring tree space. We demonstrate that PT can be used directly to approximate a Bayesian consensus tree topology. When combined with an accurate means of evaluating per-topology marginal likelihoods, PT gives an alternative procedure for obtaining Bayesian posterior distributions on phylogenetic tree topologies.
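
The search strategy described in this abstract can be sketched in a few lines. The sketch below is a single-threaded toy under stated assumptions, not the PT implementation: integers stand in for tree topologies, the `neighbors` callback stands in for local topology rearrangements, `log_lik` stands in for the likelihood with optimized branch lengths, and an ordinary dictionary replaces the nonblocking hash table shared across threads. All names are hypothetical.

```python
from collections import deque


def explore(starts, neighbors, log_lik, threshold):
    """Visit each state at most once, expanding only states whose score
    lies within `threshold` of the best score seen so far."""
    scores = {s: log_lik(s) for s in starts}  # doubles as the visited set
    best = max(scores.values())
    queue = deque(scores)
    while queue:
        state = queue.popleft()
        if scores[state] < best - threshold:
            continue  # scored and recorded, but not explored further
        for nxt in neighbors(state):
            if nxt in scores:
                continue  # never score the same state twice
            scores[nxt] = log_lik(nxt)
            best = max(best, scores[nxt])
            queue.append(nxt)
    # Report only the states that end up inside the final threshold band.
    return {s: v for s, v in scores.items() if v >= best - threshold}


# Toy landscape: a single peak at 10, neighbours differ by one step.
toy = explore([10], lambda x: (x - 1, x + 1), lambda x: -(x - 10) ** 2, threshold=4)
```

In this toy run the search stops expanding once scores fall more than 4 units below the peak, so only the states 8 through 12 are returned. A real implementation would additionally need a canonical encoding of each topology to use as the hash key.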


Subjects
Classification/methods , Phylogeny , Algorithms , Bayes Theorem , Likelihood Functions
7.
Syst Biol ; 66(4): 499-516, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-27920231

ABSTRACT

The phylogeny of early gnathostomes provides an important framework for understanding one of the most significant evolutionary events, the origin and diversification of jawed vertebrates. A series of recent cladistic analyses have suggested that the placoderms, an extinct group of armoured fish, form a paraphyletic group basal to all other jawed vertebrates. We revised and expanded this morphological data set, most notably by sampling autapomorphies in a similar way to parsimony-informative traits, thus ensuring this data (unlike most existing morphological data sets) satisfied an important assumption of Bayesian tip-dated morphological clock approaches. We also found problems with characters supporting placoderm paraphyly, including character correlation and incorrect codings. Analysis of this data set reveals that paraphyly and monophyly of core placoderms (excluding maxillate forms) are essentially equally parsimonious. The two alternative topologies have different root positions for the jawed vertebrates but are otherwise similar. However, analysis using tip-dated clock methods reveals strong support for placoderm monophyly, due to this analysis favoring trees with more balanced rates of evolution. Furthermore, enforcing placoderm paraphyly results in higher levels and unusual patterns of rate heterogeneity among branches, similar to that generated from simulated trees reconstructed with incorrect root positions. These simulations also show that Bayesian tip-dated clock methods outperform parsimony when the outgroup is largely uninformative (e.g., due to inapplicable characters), as might be the case here. The analysis also reveals that gnathostomes underwent a rapid burst of evolution during the Silurian period which declined during the Early Devonian. This rapid evolution during a period with few articulated fossils might partly explain the difficulty in ascertaining the root position of jawed vertebrates.


Subjects
Biological Evolution , Classification/methods , Fossils , Biological Models , Animals , Bayes Theorem , Phylogeny , Vertebrates
8.
Syst Biol ; 65(1): 161-76, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26231183

ABSTRACT

Sampling tree space is the most challenging aspect of Bayesian phylogenetic inference. The sheer number of alternative topologies is problematic by itself. In addition, the complex dependency between branch lengths and topology increases the difficulty of moving efficiently among topologies. Current tree proposals are fast but sample new trees using primitive transformations or re-mappings of old branch lengths. This reduces acceptance rates and presumably slows down convergence and mixing. Here, we explore branch proposals that do not rely on old branch lengths but instead are based on approximations of the conditional posterior. Using a diverse set of empirical data sets, we show that most conditional branch posteriors can be accurately approximated via a [Formula: see text] distribution. We empirically determine the relationship between the logarithmic conditional posterior density, its derivatives, and the characteristics of the branch posterior. We use these relationships to derive an independence sampler for proposing branches with an acceptance ratio of ~90% on most data sets. This proposal samples branches between 2× and 3× more efficiently than traditional proposals with respect to the effective sample size per unit of runtime. We also compare the performance of standard topology proposals with hybrid proposals that use the new independence sampler to update those branches that are most affected by the topological change. Our results show that hybrid proposals can sometimes noticeably decrease the number of generations necessary for topological convergence. Inconsistent performance gains indicate that branch updates are not the limiting factor in improving topological convergence for the currently employed set of proposals. However, our independence sampler might be essential for the construction of novel tree proposals that apply more radical topology changes.


Subjects
Classification/methods , Theoretical Models , Phylogeny , Algorithms , Bayes Theorem , Computer Simulation , Markov Chains , Monte Carlo Method
9.
BMC Bioinformatics ; 17(1): 296, 2016 Jul 29.
Article in English | MEDLINE | ID: mdl-27473391

ABSTRACT

BACKGROUND: The Gene Ontology (GO) is a dynamic, controlled vocabulary that describes the cellular function of genes and proteins according to three major categories: biological process, molecular function and cellular component. It has become widely used in many bioinformatics applications for annotating genes and measuring their semantic similarity, rather than their sequence similarity. Generally speaking, semantic similarity measures involve the GO tree topology, information content of GO terms, or a combination of both. RESULTS: Here we present a new semantic similarity measure called TopoICSim (Topological Information Content Similarity) which uses information on the specific paths between GO terms based on the topology of the GO tree, and the distribution of information content along these paths. The TopoICSim algorithm was evaluated on two human benchmark datasets based on KEGG pathways and Pfam domains grouped as clans, using GO terms from either the biological process or molecular function. The performance of the TopoICSim measure compared favorably to five existing methods. Furthermore, the TopoICSim similarity was also tested on gene/protein sets defined by correlated gene expression, using three human datasets, and showed improved performance compared to two previously published similarity measures. Finally, we used an online benchmarking resource which evaluates any similarity measure against a set of 11 similarity measures in three tests, using gene/protein sets based on sequence similarity, Pfam domains, and enzyme classifications. The results for TopoICSim showed improved performance relative to most of the measures included in the benchmarking, and in particular a very robust performance throughout the different tests. CONCLUSIONS: The TopoICSim similarity measure provides a competitive method with robust performance for quantification of semantic similarity between genes and proteins based on GO annotations.
An R script for TopoICSim is available at http://bigr.medisin.ntnu.no/tools/TopoICSim.R .


Subjects
Computational Biology/methods , Gene Ontology , Algorithms , Humans , Molecular Sequence Annotation , Semantics , Controlled Vocabulary
10.
Mol Biol Evol ; 32(6): 1611-27, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25660373

ABSTRACT

Partitioning is a commonly used method in phylogenetics that aims to accommodate variation in substitution patterns among sites. Despite its popularity, there have been few systematic studies of its effects on phylogenetic inference, and there have been no studies that compare the effects of different approaches to partitioning across many empirical data sets. In this study, we applied four commonly used approaches to partitioning to each of 34 empirical data sets, and then compared the resulting tree topologies, branch-lengths, and bootstrap support estimated using each approach. We find that the choice of partitioning scheme often affects tree topology, particularly when partitioning is omitted. Most notably, we find occasional instances where the use of a suboptimal partitioning scheme produces highly supported but incorrect nodes in the tree. Branch-lengths and bootstrap support are also affected by the choice of partitioning scheme, sometimes dramatically so. We discuss the reasons for these effects and make some suggestions for best practice.


Subjects
Classification/methods , Genetic Models , Phylogeny , Genetic Databases , Empirical Research , Molecular Evolution
11.
J Math Biol ; 73(5): 1251-1291, 2016 11.
Article in English | MEDLINE | ID: mdl-27009067

ABSTRACT

We introduce two models for random trees with multiple states motivated by studies of trait dependence in the evolution of species. Our discrete time model, the multiple state ERM tree, is a generalization of Markov propagation models on a random tree generated by a binary search or 'equal rates Markov' mechanism. Our continuous time model, the multiple state Yule tree, is a generalization of the tree generated by a pure birth or Yule process to the tree generated by multi-type branching processes. We study state dependent topological properties of these two random tree models. We derive asymptotic results that allow one to infer model parameters from data on states at the leaves and at branch-points that are one step away from the leaves.


Subjects
Classification/methods , Genetic Models , Phylogeny , Markov Chains
12.
Sensors (Basel) ; 16(10)2016 Oct 19.
Article in English | MEDLINE | ID: mdl-27775571

ABSTRACT

This paper provides a performance evaluation of tree and mesh routing topologies of wireless sensor networks (WSNs) in a cultural heritage site. The historical site selected was San Juan Bautista church in Talamanca de Jarama (Madrid, Spain). We report the preliminary analysis required to study the effects of heating in this historical location using WSNs to monitor the temperature and humidity conditions during periods of weeks. To test which routing topology was better for this kind of application, the WSNs were first deployed on the upper floor of the CAEND institute in Arganda del Rey simulating the church deployment, but in the former scenario there was no direct line of sight between the WSN elements. Two parameters were selected to evaluate the performance of the routing topologies of WSNs: the percentage of received messages and the lifetime of the wireless sensor network. To analyze in more detail which topology gave the best performance, other communication parameters were also measured. The tree topology used was the collection tree protocol and the mesh topology was the XMESH provided by MEMSIC (Andover, MA, USA). For the scenarios presented in this paper, it can be concluded that the tree topology lost fewer messages than the mesh topology.


Subjects
Environmental Monitoring/methods , Wireless Technology , Algorithms , Computer Communication Networks , Theoretical Models
13.
PeerJ ; 12: e16706, 2024.
Article in English | MEDLINE | ID: mdl-38213769

ABSTRACT

Recently, many studies have addressed the performance of phylogenetic tree-building methods (maximum parsimony, maximum likelihood, and Bayesian inference), focusing primarily on simulated data. However, for discrete morphological data, there is no consensus yet on which methods recover the phylogeny with better performance. To address this lack of consensus, we investigate the performance of different methods using an empirical dataset for hexapods as a model. As an empirical test of performance, we applied normalized indices to effectively measure accuracy (normalized Robinson-Foulds metric, nRF) and precision, which are measured via resolution, one minus Colless' consensus fork index (1-CFI). Additionally, to further explore phylogenetic accuracy and support measures, we calculated other statistics, such as the true positive rate (statistical power) and the false positive rate (type I error), and constructed receiver operating characteristic plots to visualize the relationship between these statistics. We applied the normalized indices to the reconstructed trees from the reanalyses of an empirical discrete morphological dataset from extant Hexapoda using a well-supported phylogenomic tree as a reference. Maximum likelihood and Bayesian inference applying the k-state Markov (Mk) model (without or with a discrete gamma distribution) performed better, showing higher precision (resolution). Additionally, our results suggest that most available tree topology tests are reliable estimators of the performance measures applied in this study. Thus, we suggest that likelihood-based methods and tree topology tests should be used more often in phylogenetic tree studies based on discrete morphological characters. Our study provides a fair indication that morphological datasets have robust phylogenetic signal.
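
To make the accuracy measure concrete, a normalized Robinson-Foulds distance can be computed by comparing the sets of clades in two trees. This is a minimal sketch under assumptions made here, not the study's procedure: trees are rooted and written as nested tuples (polytomies allowed), clades rather than unrooted bipartitions are compared, and the distance is normalized by the total number of non-root clades in both trees. All names are illustrative.

```python
def clades(tree):
    """Return (leaf set, set of internal-node clades) for a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the clade defined by this internal node
    return leaves, found


def normalized_rf(tree_a, tree_b):
    """Symmetric-difference distance between clade sets, scaled to [0, 1]."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must share the same leaf set")
    # The root clade is present in every tree on this leaf set, so drop it.
    clades_a.discard(leaves_a)
    clades_b.discard(leaves_b)
    total = len(clades_a) + len(clades_b)
    return len(clades_a ^ clades_b) / total if total else 0.0


reference = (("a", "b"), ("c", "d"))
estimate = ((("a", "b"), "c"), "d")
nrf = normalized_rf(reference, estimate)
```

A value of 0 means the two trees contain exactly the same clades and a value of 1 means they share none. Because an unresolved tree contributes few clades, this kind of accuracy measure is usually read together with a resolution measure, as the abstract does.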


Subjects
Arthropods , Animals , Phylogeny , Likelihood Functions , Bayes Theorem , Insects
14.
PeerJ Comput Sci ; 10: e1932, 2024.
Article in English | MEDLINE | ID: mdl-38660199

ABSTRACT

Data aggregation plays a critical role in sensor networks for efficient data collection. However, the assumption of uniform initial energy levels among sensors in existing algorithms is unrealistic in practical production applications. This discrepancy in initial energy levels significantly impacts data aggregation in sensor networks. To address this issue, we propose Data Aggregation with Different Initial Energy (DADIE), a novel algorithm that aims to enhance energy-saving, privacy-preserving efficiency, and reduce node death rates in sensor networks with varying initial energy nodes. DADIE considers the transmission distance between nodes and their initial energy levels when forming the network topology, while also limiting the number of child nodes. Furthermore, DADIE reconstructs the aggregation tree before each round of data transmission. This allows nodes closer to the receiving end with higher initial energy to undertake more data aggregation and transmission tasks while limiting energy consumption. As a result, DADIE effectively reduces the node death rate and improves the efficiency of data transmission throughout the network. To enhance network security, DADIE establishes secure transmission channels between transmission nodes prior to data transmission, and it employs slice-and-mix technology within the network. Our experimental simulations demonstrate that the proposed DADIE algorithm effectively resolves the data aggregation challenges in sensor networks with varying initial energy nodes. It achieves 5-20% lower communication overhead and energy consumption, 10-20% higher security, and 10-30% lower node mortality than existing algorithms.

15.
3 Biotech ; 9(6): 233, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31139548

ABSTRACT

Susumu Ohno hypothesized that the diversity of vertebrate gene families and genomes is due to two rounds of whole-genome duplication (also referred to as the 2R hypothesis). The quadruplicate paralogous blocks present on chromosomes 1/2/8/20 are taken as one line of evidence in favor of 2R. In this study, we investigated whether 2R has shaped vertebrate evolution using gene families residing on chromosomes 1/2/8/20. The evolutionary history of 22 gene families (11 from the current study and 11 from the previous study) with triplicated or quadruplicated distribution on human chromosomes 1/2/8/20 was evaluated by phylogenetic analysis. The phylogenetic analysis was performed using high-quality whole-genome sequence data of multiple species with neighbor-joining (NJ) and maximum likelihood (ML) methods. The phylogenetic tree topology of these gene families revealed variable duplication time points during invertebrate-vertebrate evolution. A topology comparison approach categorized the 22 gene families into three groups. Tree topologies of ten gene families fell into Group 1 (duplications prior to the invertebrate-vertebrate split), four into Group 2 (i.e., (AB) (C) (D), topology incongruent with 2R) and eight into Group 3 (((AB) (CD)), 2R-congruent topology). Therefore, taking together the current and previous data on the 1/2/8/20 paralogons, we propose that, in addition to whole-genome duplication events, the current developmental, morphological and genomic complexity of vertebrate genomes may also have originated through segmental duplications occurring at varying time points during the course of animal evolution.

16.
Springerplus ; 5(1): 766, 2016.
Article in English | MEDLINE | ID: mdl-27386252

ABSTRACT

A data center is a facility for housing computational and storage systems interconnected through a communication network called a data center network (DCN). Due to tremendous growth in computational power, storage capacity and the number of interconnected servers, the DCN faces challenges concerning efficiency, reliability and scalability. Although the transmission control protocol (TCP) is a time-tested transport protocol in the Internet, DCN challenges such as inadequate buffer space in switches and bandwidth limitations have prompted researchers to propose techniques to improve TCP performance or to design new transport protocols for DCNs. Data center TCP (DCTCP) emerges as one of the most promising solutions in this domain; it employs the explicit congestion notification feature of TCP to enhance the TCP congestion control algorithm. While DCTCP has been analyzed for a two-tier tree-based DCN topology with traffic between servers in the same rack, which is common in cloud applications, it remains oblivious to the traffic patterns common in university and private enterprise networks, which traverse the complete network interconnect spanning the upper tier layers. We also recognize that DCTCP performance cannot remain unaffected by the underlying DCN architecture; hence, there is a need to test and compare DCTCP performance when implemented over diverse DCN architectures. Some of the most notable DCN architectures are the legacy three-tier, fat-tree, BCube, DCell, VL2, and CamCube. In this research, we simulate two switch-centric DCN architectures, the widely deployed legacy three-tier architecture and the promising fat-tree architecture, using a network simulator, and analyze the performance of DCTCP in terms of throughput and delay for realistic traffic patterns. We also examine how DCTCP prevents incast and outcast congestion when realistic DCN traffic patterns are employed in the above-mentioned topologies.
Our results show that the underlying DCN architecture significantly impacts DCTCP performance. We find that DCTCP gives optimal performance in fat-tree topology and is most suitable for large networks.

17.
Comput Biol Chem ; 57: 61-71, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25861917

ABSTRACT

The topology or shape of evolutionary trees and their unbalanced nature has been a long standing area of interest in the field of phylogenetics. Coevolutionary analysis, which considers the evolutionary relationships between a pair of phylogenetic trees, has to date not considered leveraging this unbalanced nature as a means to reduce the complexity of coevolutionary analysis. In this work we apply previous analyses of tree shapes to improve the efficiency of inferring coevolutionary events. In particular, we use this prior research to derive a new data structure for inferring coevolutionary histories. Our new data structure is proven to provide a reduction in the time and space required to infer coevolutionary events. It is integrated into an existing framework for coevolutionary analysis and has been validated using both synthetic and previously published biological data sets. This proposed data structure performs twice as fast as algorithms implemented using existing data structures with no degradation in the algorithm's accuracy. As the coevolutionary data sets increase in size so too does the running time reduction provided by the newly proposed data structure. This is due to our data structure offering a logarithmic time and space complexity improvement. As a result, the proposed update to existing coevolutionary analysis algorithms outlined herein should enable the inference of larger coevolutionary systems in the future.

18.
Evol Bioinform Online ; 8: 489-525, 2012.
Article in English | MEDLINE | ID: mdl-22915837

ABSTRACT

In the last two decades, a large number of whole-genome phylogenies have been inferred to reconstruct the Tree of Life (ToL). Underlying data models range from gene or functionality content in species to phylogenetic gene family trees and multiple sequence alignments of concatenated protein sequences. Diversity in data models together with the use of different tree reconstruction techniques, disruptive biological effects and the steadily increasing number of genomes have led to a huge diversity in published phylogenies. Comparison of those and, moreover, identification of the impact of inference properties (underlying data model, inference technique) on particular reconstructions is almost impossible. In this work, we introduce tree topology profiling as a method to compare already published whole-genome phylogenies. This method requires visual determination of the particular topology in a drawn whole-genome phylogeny for a set of particular bacterial clans. For each clan, neighborhoods to other bacteria are collected into a catalogue of generalized alternative topologies. Particular topology alternatives found for an ordered list of bacterial clans reveal a topology profile that represents the analyzed phylogeny. To simulate the inhomogeneity of published gene content phylogenies we generate a set of seven phylogenies using different inference techniques and the SYSTERS-PhyloMatrix data model. After tree topology profiling on in total 54 selected published and newly inferred phylogenies, we separate artefactual from biologically meaningful phylogenies and associate particular inference results (phylogenies) with inference background (inference techniques as well as data models). Topological relationships of particular bacterial species groups are presented. With this work we introduce tree topology profiling into the scientific field of comparative phylogenomics.

19.
Evolution ; 46(6): 1818-1826, 1992 Dec.
Article in English | MEDLINE | ID: mdl-28567767

ABSTRACT

I examine patterns in tree balance for a sample of 208 cladograms and phenograms from the recent literature. I provide an expression for expected imbalance under a simple, uniform-rate random speciation model, and I estimate variances by simulation for the same model. Imbalance decreases with tree size (number of included taxa) in both theoretical and literature trees. In contrast to previous suggestions, I find cladistic trees to be no more imbalanced than phenetic trees when confounding variables are appropriately controlled. The degree of imbalance found in literature trees is inconsistent with the uniform-rate speciation model; this is most likely a result of variability in speciation and extinction rates among real lineages. The existence of such variation is a necessary (but not sufficient) condition for the operation of the macroevolutionary processes of species sorting and species selection.
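
One widely used way to quantify the imbalance discussed here is Colless' index, which sums, over all internal nodes, the absolute difference in the number of tips on the two sides. Whether this matches the exact statistic used in the study is an assumption. The sketch also assumes strictly binary trees written as nested tuples and scales the index by its maximum, (n - 1)(n - 2) / 2, reached by a fully ladder-like tree with n tips. Function names are illustrative.

```python
def colless(tree):
    """Return (number of tips, Colless sum) for a binary nested-tuple tree."""
    if not isinstance(tree, tuple):
        return 1, 0  # a single tip
    n_left, c_left = colless(tree[0])
    n_right, c_right = colless(tree[1])
    return n_left + n_right, c_left + c_right + abs(n_left - n_right)


def normalized_colless(tree):
    """Colless sum divided by its maximum for a tree with the same tip count."""
    n, total = colless(tree)
    if n < 3:
        raise ValueError("normalization needs at least three tips")
    return total / ((n - 1) * (n - 2) / 2)


balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")
i_balanced = normalized_colless(balanced)
i_caterpillar = normalized_colless(caterpillar)
```

The normalized value runs from 0 for a perfectly balanced tree to 1 for a fully ladder-like one, which allows trees with different numbers of taxa to be placed on a common scale.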

20.
Evolution ; 50(6): 2141-2148, 1996 Dec.
Article in English | MEDLINE | ID: mdl-28565665

ABSTRACT

Aspects of phylogenetic tree shape, and in particular tree balance, provide clues to the workings of the macroevolutionary process. I use a simulation approach to explore patterns in tree balance for several models of the evolutionary process under which speciation rates vary through the history of diversifying clades. I demonstrate that when speciation rates depend on an evolving trait of individuals, and are therefore "heritable" along evolutionary lineages, the resulting phylogenies become imbalanced. However, imbalance also results from some (but not all) models of "nonheritable" speciation rate variation. The degree of imbalance increases with the magnitude of speciation rate variation, and then for gradual evolution (but not punctuated equilibria) reaches an asymptote short of the theoretical maximum. Very high levels of rate variation are required to produce imbalance matching that found in real data (estimated phylogenies from the systematic literature). I discuss implications of the simulation results for our understanding of macroevolution.
