ABSTRACT
Rapidly evolving taxa are excellent models for understanding the mechanisms that give rise to biodiversity. However, developing an accurate historical framework for comparative analysis of such lineages remains a challenge due to ubiquitous incomplete lineage sorting (ILS) and introgression. Here, we use a whole-genome alignment, multiple locus-sampling strategies, and summary-tree and single nucleotide polymorphism-based species-tree methods to infer a species tree for eastern North American Neodiprion species, a clade of pine-feeding sawflies (Order: Hymenoptera; Family: Diprionidae). We recovered a well-supported species tree that, except for three uncertain relationships, was robust to different strategies for analyzing whole-genome data. Nevertheless, underlying gene-tree discordance was high. To understand this genealogical variation, we used multiple linear regression to model site concordance factors estimated in 50-kb windows as a function of several genomic predictor variables. We found that site concordance factors tended to be higher in regions of the genome with more parsimony-informative sites, fewer singletons, less missing data, lower GC content, more genes, lower recombination rates, and lower D-statistics (less introgression). Together, these results suggest that ILS, introgression, and genotyping error all shape the genomic landscape of gene-tree discordance in Neodiprion. More generally, our findings demonstrate how combining phylogenomic analysis with knowledge of local genomic features can reveal mechanisms that produce topological heterogeneity across genomes.
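The D-statistic used above as a predictor is commonly computed from ABBA and BABA site-pattern counts as D = (ABBA - BABA) / (ABBA + BABA). The following is a minimal sketch of that count for four aligned sequences; the function name, the input layout (P1, P2, P3, outgroup as equal-length strings) and the handling of ambiguous sites are assumptions made for illustration, not the authors' pipeline.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Patterson's D from four aligned sequences (hypothetical helper).

    Only biallelic sites with unambiguous bases are counted. The outgroup
    allele is treated as ancestral ("A"), the other allele as derived ("B").
    Returns None when no ABBA or BABA site is found.
    """
    valid = set("ACGT")
    abba = 0
    baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        if not {a, b, c, o} <= valid:
            continue  # skip gaps and ambiguity codes
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    if abba + baba == 0:
        return None
    return (abba - baba) / (abba + baba)


# Toy alignment: two ABBA sites, one BABA site, one invariant site.
example_d = d_statistic("AGAA", "GAGA", "GGGA", "AAAA")
```

In a windowed analysis such as the one described, a function like this would be applied to each 50-kb window separately; values near zero are expected under ILS alone, while a consistent excess of one pattern suggests introgression.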
Subjects
Genome, Insect , Hymenoptera , Phylogeny , Animals , Hymenoptera/genetics , Hymenoptera/classification , Genome, Insect/genetics , Classification/methods
ABSTRACT
Variation in gene tree estimates is widely observed in empirical phylogenomic data and is often assumed to be the result of biological processes. However, a recent study using tetrapod mitochondrial genomes to control for biological sources of variation due to their haploid, uniparentally inherited, and non-recombining nature found that levels of discordance among mitochondrial gene trees were comparable to those found in studies that assume only biological sources of variation. Additionally, they found that several of the models of sequence evolution chosen to infer gene trees were doing an inadequate job fitting the sequence data. These results indicated that significant amounts of gene tree discordance in empirical data may be due to poor fit of sequence evolution models, and that more complex and biologically realistic models may be needed. To test how the fit of sequence evolution models relates to gene tree discordance, we analyzed the same mitochondrial datasets as the previous study using two additional, more complex models of sequence evolution that each includes a different biologically realistic aspect of the evolutionary process: a covarion model to incorporate site-specific rate variation across lineages (heterotachy), and a partitioned model to incorporate variable evolutionary patterns by codon position. Our results show that both additional models fit the data better than the models used in the previous study, with the covarion being consistently and strongly preferred as tree size increases. However, even these more preferred models still inferred highly discordant mitochondrial gene trees, thus deepening the mystery around what we label the "Mito-Phylo Paradox" and leading us to ask whether the observed variation could, in fact, be biological in nature after all.
ABSTRACT
Despite extensive morphological and molecular studies, the phylogenetic interrelationships within the infraorder Brachyura and the phylogenetic positions of many taxa remain uncertain. Studies that used a limited number of molecular markers have often failed to provide sufficient resolution, and may be susceptible to stochastic errors and incomplete lineage sorting (ILS). Here we reconstructed the phylogenetic relationships within the Brachyura using transcriptome data of 56 brachyuran species, including 14 newly sequenced taxa. Five supermatrices were constructed in order to exclude different sources of systematic error. The results of the phylogenetic analyses indicate that Heterotremata is non-monophyletic, and that the two Old World primary freshwater crabs (Potamidae and Gecarcinucidae) and the Hymenosomatoidea form a clade that is sister to the Thoracotremata, and outside the Heterotremata. We also found that ILS is the main cause of the gene-tree discordance of these freshwater crabs. Divergence time estimations indicate that the Brachyura has an ancient origin, probably either in the Triassic or Jurassic, and that the majority of extant families and superfamilies first appeared during the Cretaceous, with a constant increase of diversity in Post-Cretaceous-Palaeogene times. The results support the hypothesis that the two Old World freshwater crab families included in this study (Potamidae and Gecarcinucidae) diverged from their marine ancestors around 120 Ma, in the Cretaceous. In addition, this work provides new insights that may aid in the reclassification of some of the more problematic brachyuran groups.
ABSTRACT
The tribe Astereae (Asteraceae) includes 36 subtribes and 252 genera, and is distributed worldwide in temperate and tropical regions. One of the subtribes, Celmisiinae Saldivia, has been recently circumscribed to include six genera and ca. 160 species, and is restricted to eastern Australia, New Zealand, and New Guinea. The species show an impressive range of growth habit, from small herbs and ericoid subshrubs to medium-sized trees. They live in a wide range of habitats and are often dominant in subalpine and alpine vegetation. Despite the well-supported circumscription of Celmisiinae, uncertainties have remained about their internal relationships and classification at genus and species levels. This study exploited recent advances in high-throughput sequencing to build a robust multi-gene phylogeny for the subtribe Celmisiinae. The target enrichment Angiosperms353 bait set and the hybpiper-nf and paragone-nf pipelines were used to retrieve, infer, and assemble orthologous loci from 75 taxa representing all the main putative clades within the subtribe. Because of diploidisation in Celmisiinae, as well as missing data in the assemblies, orthology inference remains uncertain. However, based on a variety of gene-family sets, coalescent and concatenation-based phylogenetic reconstructions recovered similar topologies. Paralogy and missing data in the gene families caused some problems, but the estimated phylogenies were well-supported and well-resolved. The phylogenomic evidence supported Celmisiinae and three main clades: the Pleurophyllum clade (Pleurophyllum, Macrolearia and Damnamenia), mostly in the New Zealand Subantarctic Islands; Celmisia of mainland New Zealand and Australia; and Shawia (including 'Olearia pro parte' and Pachystegia) of New Zealand, Australia and New Guinea.
The results presented here add to the accumulating support for the Angiosperms353 bait set as an efficient method for documenting plant diversity.
Subjects
Asteraceae , Humans , Phylogeny , Asteraceae/genetics , Biological Evolution , Australia , High-Throughput Nucleotide Sequencing/methods
ABSTRACT
Hordeum is an economically and evolutionarily important genus within the Triticeae tribe of the family Poaceae, and contains 33 widely distributed and diverse species which cytologically represent four subgenomes (H, Xa, Xu and I). These wild species (except Hordeum spontaneum, which is the primary gene pool of barley) are secondary or tertiary gene-pool germplasms for barley and wheat improvement, and uncovering their complicated evolutionary relationships would benefit future breeding programs. Here, we developed a complexity-reduced pipeline that captures genome-wide distributed fragments with two novel target-enrichment assays (HorCap v1.0 and BarPlex v1.0) in conjunction with high-throughput sequencing of the enriched fragments. Both assays were tested for genotyping 40 species from three genera (Hordeum, Triticum, and Aegilops), comprising 82 samples (67 accessions). Each assay alone worked efficiently in genotyping, while integrating both significantly improved the robustness and resolution of the Hordeum phylogenetic trees. Interestingly, incomplete lineage sorting (ILS) was inferred for the first time as the major factor causing phylogenetic discordance among the four subgenomes, whereas in New World species (carrying the I genome) post-speciation introgression events were revealed. By revising the evolutionary relationships of the Hordeum species based on an ancestral state reconstruction for the diploids and parental donor inference for the polyploids, our results raise new questions about the Hordeum phylogeny. Moreover, both newly developed assays are applicable in genotyping and phylogenetic analysis of Hordeum and other Triticeae wild species.
Subjects
Hordeum , Phylogeny , Hordeum/genetics , Hordeum/classification , High-Throughput Nucleotide Sequencing , Triticum/genetics , Triticum/classification , Genome, Plant , Genotype , Aegilops/genetics , Aegilops/classification , Sequence Analysis, DNA
ABSTRACT
Evolutionary biologists have long been fascinated with the episodes of rapid phenotypic innovation that underlie the emergence of major lineages. Although our understanding of the environmental and ecological contexts of such episodes has steadily increased, it has remained unclear how population processes contribute to emergent macroevolutionary patterns. One insight gleaned from phylogenomics is that gene-tree conflict, frequently caused by population-level processes, is often rampant during the origin of major lineages. With the understanding that phylogenomic conflict is often driven by complex population processes, we hypothesized that there may be a direct correspondence between instances of high conflict and elevated rates of phenotypic innovation if both patterns result from the same processes. We evaluated this hypothesis in six clades spanning vertebrates and plants. We found that the most conflict-rich regions of these six clades also tended to experience the highest rates of phenotypic innovation, suggesting that population processes shaping both phenotypic and genomic evolution may leave signatures at deep timescales. Closer examination of the biological significance of phylogenomic conflict may yield improved connections between micro- and macroevolution and increase our understanding of the processes that shape the origin of major lineages across the Tree of Life.
Subjects
Birds/genetics , Genomics/methods , Mammals/genetics , Phylogeny , Plants/genetics , Animals , Birds/anatomy & histology , Birds/classification , Evolution, Molecular , Genomics/statistics & numerical data , Mammals/anatomy & histology , Mammals/classification , Phenotype , Plants/anatomy & histology , Plants/classification , Species Specificity
ABSTRACT
Molecular data have been used to date species divergences ever since they were described as documents of evolutionary history in the 1960s. Yet, an inadequate fossil record and discordance between gene trees and species trees are persistently problematic. We examine how, by accommodating gene tree discordance and by scaling branch lengths to absolute time using mutation rate and generation time, multispecies coalescent (MSC) methods can potentially overcome these challenges. We find that time estimates can differ, in some cases substantially, depending on whether MSC methods or traditional phylogenetic methods that apply concatenation are used, and whether the tree is calibrated with pedigree-based mutation rates or with fossils. We discuss the advantages and shortcomings of both approaches and provide practical guidance for data analysis when using these methods.
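The scaling step mentioned above can be illustrated with simple arithmetic: a branch length in expected substitutions per site, divided by a per-generation mutation rate, gives a number of generations, and multiplying by generation time gives years. The sketch below assumes this simple strict conversion; the function name and the numerical values are illustrative placeholders, not estimates taken from the study.

```python
def branch_length_to_years(subs_per_site, mutation_rate_per_generation,
                           generation_time_years):
    """Convert a branch length to absolute time (hypothetical helper).

    subs_per_site: branch length in expected substitutions per site
    mutation_rate_per_generation: mutations per site per generation
    generation_time_years: average generation time in years
    """
    if mutation_rate_per_generation <= 0 or generation_time_years <= 0:
        raise ValueError("rate and generation time must be positive")
    generations = subs_per_site / mutation_rate_per_generation
    return generations * generation_time_years


# Illustrative inputs only: 0.006 substitutions/site, a rate of
# 1.2e-8 per site per generation, and a 25-year generation time.
example_years = branch_length_to_years(0.006, 1.2e-8, 25.0)
```

Because the result scales linearly with generation time and inversely with mutation rate, uncertainty in either quantity carries over directly to the date estimate, which is one reason pedigree-calibrated and fossil-calibrated analyses can disagree.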
Subjects
Biological Evolution , Fossils , Mammals/classification , Mammals/genetics , Models, Theoretical , Mutation Rate , Phylogeny , Animals , Gene Flow , Models, Genetic
ABSTRACT
Gene tree discordance is a significant legacy of biological evolution. Multiple factors can result in incongruence among genes, such as introgression, incomplete lineage sorting (ILS), gene duplication or loss. Resolving the background of gene tree discordance is a critical way to uncover the process of species diversification. Camellia, the largest genus in Theaceae, has controversial taxonomy and systematics due in part to a complex evolutionary history. We used 60 transcriptomes of 55 species, which represented 15 sections of Camellia to investigate its phylogeny and the possible causes of gene tree discordance. We conducted gene tree discordance analysis based on 1,617 orthologous low-copy nuclear genes, primarily using coalescent species trees and polytomy tests to distinguish hard and soft conflict. A selective pressure analysis was also performed to assess the impact of selection on phylogenetic topology reconstruction. Our results detected different levels of gene tree discordance in the backbone of Camellia, and recovered rapid diversification as one of the possible causes of gene tree discordance. Furthermore, we confirmed that none of the currently proposed sections of Camellia was monophyletic. Comparisons among datasets partitioned under different selective pressure regimes showed that integrating all orthologous genes provided the best phylogenetic resolution of the species tree of Camellia. The findings of this study reveal rapid diversification as a major source of gene tree discordance in Camellia and will facilitate future investigation of reticulate relationships at the species level in this important plant genus.
Subjects
Camellia , Theaceae , Camellia/genetics , Phylogeny , Biological Evolution , Gene Duplication
ABSTRACT
North American Thamnophiini (gartersnakes, watersnakes, brownsnakes, and swampsnakes) are an ecologically and phenotypically diverse temperate clade of snakes representing 61 species across 10 genera. In this study, we estimate phylogenetic trees using ~3,700 ultraconserved elements (UCEs) for 76 specimens representing 75% of all Thamnophiini species. We infer phylogenies using multispecies coalescent methods and time calibrate them using the fossil record. We also conducted ancestral area estimation to identify how major biogeographic boundaries in North America affect broadscale diversification in the group. While most nodes exhibited strong statistical support, analysis of concordant data across gene trees reveals substantial heterogeneity. Ancestral area estimation demonstrated that the genus Thamnophis was the only taxon in this subfamily to cross the Western Continental Divide, even as other taxa dispersed southward toward the tropics. Additionally, levels of gene tree discordance are overall higher in transition zones between bioregions, including the Rocky Mountains. Therefore, the Western Continental Divide may be a significant transition zone structuring the diversification of Thamnophiini during the Neogene and Pleistocene. Here we show that despite high levels of discordance across gene trees, we were able to infer a highly resolved and well-supported phylogeny for Thamnophiini, which allows us to understand broadscale patterns of diversity and biogeography.
Subjects
Colubridae , Animals , Phylogeny , North America
ABSTRACT
The commercial strawberry, Fragaria × ananassa, is a recent allo-octoploid that is cultivated worldwide. However, other than Fragaria vesca, which is universally accepted as one of its diploid ancestors, its other early diploid progenitors remain unclear. Here, we performed comparative analyses of the genomes of five diploid strawberries, F. iinumae, F. vesca, F. nilgerrensis, F. nubicola, and F. viridis, of which the latter three are newly sequenced. We found that the genomes of these species share highly conserved gene content and gene order. Using an alignment-based approach, we show that F. iinumae and F. vesca are the diploid progenitors to the octoploid F. × ananassa, whereas the other three diploids that we analyzed in this study are not parental species. We generated a fully resolved, dated phylogeny of Fragaria, and determined that the genus arose ~6.37 Ma. Our results effectively resolve conflicting hypotheses regarding the putative diploid progenitors of the cultivated strawberry, establish a reliable backbone phylogeny for the genus, and provide genetic resources for molecular breeding.
Subjects
Diploidy , Fragaria/genetics , Genome, Plant , Hybridization, Genetic , Phylogeny , Domestication , Polyploidy
ABSTRACT
BACKGROUND: Plastid genomes (plastomes) present great potential in resolving multiscale phylogenetic relationships, but few studies have focused on the influence of genetic characteristics of plastid genes, such as genetic variation and phylogenetic discordance, in resolving the phylogeny within a lineage. Here we examine plastome characteristics of Cycas L., the most diverse genus among extant cycads, and investigate the deep phylogenetic relationships within Cycas by sampling 47 plastomes representing all major clades from six sections. RESULTS: All Cycas plastomes shared consistent gene content and structure, with only one gene loss detected in the Philippine species C. wadei. Three novel plastome regions (psbA-matK, trnN-ndhF, chlL-trnN) were identified as containing the highest nucleotide variability. Molecular evolutionary analysis showed that most of the plastid protein-coding genes have been under purifying selection, except ndhB. Phylogenomic analyses using both concatenated and coalescent methods identified four clades but with conflicting topologies at shallow nodes. Specifically, we found that three species-rich Cycas sections, namely Stangerioides, Indosinenses and Cycas, were not or only weakly supported as monophyletic in the plastome phylogeny. Tree space analyses based on different tree-inference methods consistently revealed three gene clusters, of which the cluster with moderate genetic properties showed the best congruence with the favored phylogeny. CONCLUSIONS: Our exploration of plastomic data for Cycas supports the idea that plastid protein-coding genes may exhibit discordance in phylogenetic signals. The incongruence between molecular phylogeny and morphological classification reported here may largely be attributed to the uniparental inheritance of plastids, which cannot offer sufficient information to resolve the phylogeny.
In contrast to a previous consensus that genes with longer sequences and a higher proportion of variation are superior for phylogeny reconstruction, our results imply that the most effective phylogenetic signal could come from loci with moderate variation, GC content, and sequence length that have undergone modest selection.
Subjects
Cycas , Genome, Plastid , Cycadopsida/genetics , Genome, Plastid/genetics , Phylogeny , Plastids/genetics
ABSTRACT
BACKGROUND AND AIMS: Abelia (Caprifoliaceae) is a small genus with five species, including one artificial hybrid and several natural hybrids. The genus has a discontinuous distribution in Mainland China, Taiwan Island and the Ryukyu Islands, providing a model system to explore the mechanisms of species dispersal in the East Asian flora. However, the current phylogenetic relationships within Abelia remain uncertain. METHODS: We reconstructed the phylogenetic relationships within Abelia using nuclear loci generated by target enrichment and plastomes from genome skimming. Divergence time estimation, ancestral area reconstruction and ecological niche modelling (ENM) were used to examine the diversification history of Abelia. KEY RESULTS: We found extensive cytonuclear discordance across the genus. By integrating lines of evidence from molecular phylogenies, divergence times and morphology, we propose to merge Abelia macrotera var. zabelioides into A. uniflora. Network analyses suggested that there have been multiple widespread hybridization events among Abelia species. These hybridization events may have contributed to the speciation mechanism and resulted in the high observed morphological diversity. The diversification of Abelia began in the early Eocene, followed by A. chinensis var. ionandra colonizing Taiwan Island during the Middle Miocene. The ENM results suggested an expansion of climatically suitable areas during the Last Glacial Maximum and range contraction during the Last Interglacial. Disjunction between the Himalayan-Hengduan Mountain region and Taiwan Island is probably the consequence of topographical isolation and postglacial contraction. CONCLUSIONS: We used genomic data to reconstruct the phylogeny of Abelia and found a clear pattern of reticulate evolution in the group. In addition, our results suggest that shrinkage of postglacial range and the heterogeneity of the terrain have led to the disjunction between Mainland China and Taiwan Island. 
This study provides important new insights into the speciation process and taxonomy of Abelia.
Subjects
Caprifoliaceae , China , Ecosystem , Hybridization, Genetic , Phylogeny , Phylogeography
ABSTRACT
Incomplete lineage sorting (ILS) is an important factor that causes gene tree discordance. For gene trees of three species, under neutrality, random mating, and the absence of interspecific gene flow, ILS creates a symmetric distribution of gene trees: the gene tree that accords with the species tree has the highest frequency, and the two discordant trees are equally frequent. If the neutral condition is violated, the impact of ILS may change, altering the gene tree distribution. Here, we show that under purifying selection, even assuming that the fitness effect of mutations is constant throughout the species tree, if differences in population size exist among species, asymmetric distributions of gene trees will arise, which differs from the expectation under neutrality. In extreme cases, one of the discordant trees rather than the concordant tree becomes the most frequent gene tree. In addition, we found that in a real case, the position of Scandentia relative to Primates and Glires, the symmetry in the gene tree distribution can be influenced by the strength of purifying selection. In current phylogenetic inference, the impact of purifying selection on the gene tree distribution is rarely considered by researchers. This study highlights the necessity of considering this impact.
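The symmetric neutral expectation described above has a standard closed form under the multispecies coalescent: for a rooted species tree ((A,B),C) with an internal branch of length T in coalescent units, the concordant gene tree has probability 1 - (2/3)e^(-T) and each discordant tree has probability (1/3)e^(-T). The sketch below evaluates that neutral baseline only; the function name and tree labels are illustrative, and it does not model the purifying-selection effect the study reports.

```python
import math


def three_taxon_gene_tree_probs(internal_branch_coalescent_units):
    """Neutral coalescent gene-tree probabilities for species tree ((A,B),C).

    The internal branch length is in coalescent units (generations divided
    by twice the effective population size for diploids).
    """
    t = internal_branch_coalescent_units
    if t < 0:
        raise ValueError("branch length must be non-negative")
    p_each_discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_each_discordant,  # concordant topology
        "((A,C),B)": p_each_discordant,              # discordant topology
        "((B,C),A)": p_each_discordant,              # discordant topology
    }


# A short internal branch leaves substantial discordance.
example_probs = three_taxon_gene_tree_probs(0.5)
```

Against this baseline, an excess of one discordant topology over the other is often read as evidence of gene flow; the abstract's point is that purifying selection combined with unequal population sizes can produce a similar asymmetry.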
Subjects
Computational Biology/methods , Primates/genetics , Rodentia/genetics , Scandentia/genetics , Animals , Evolution, Molecular , Gene Flow , Genetic Speciation , Models, Genetic , Phylogeny , Population Density , Selection, Genetic
ABSTRACT
BACKGROUND: Sequence data used in reconstructing phylogenetic trees may include various sources of error. Typically, errors are detected at the sequence level, but when missed, the erroneous sequences often appear as unexpectedly long branches in the inferred phylogeny. RESULTS: We propose an automatic method to detect such errors. We build a phylogeny including all the data, then detect sequences that artificially inflate the tree diameter. We formulate an optimization problem, called the k-shrink problem, that seeks to find k leaves that could be removed to maximally reduce the tree diameter. We present an algorithm to find the exact solution for this problem in polynomial time. We then use several statistical tests to find outlier species that have an unexpectedly high impact on the tree diameter. These tests can use a single tree or a set of related gene trees and can also adjust to species-specific patterns of branch length. The resulting method is called TreeShrink. We test our method on six phylogenomic biological datasets and an HIV dataset and show that the method successfully detects and removes long branches. TreeShrink removes sequences more conservatively than rogue taxon removal and often reduces gene tree discordance more than rogue taxon removal once the amount of filtering is controlled. CONCLUSIONS: TreeShrink is an effective method for detecting sequences that lead to unrealistically long branch lengths in phylogenetic trees. The tool is publicly available at https://github.com/uym2/TreeShrink.
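To make the k-shrink objective concrete, the sketch below solves it by exhaustive search on a tiny tree: it tries every set of k leaves and keeps the set whose removal leaves the smallest diameter (largest leaf-to-leaf path length). This is not the polynomial-time algorithm of TreeShrink and it omits the statistical tests; the edge-list input format and all function names are assumptions made for illustration.

```python
from itertools import combinations


def leaf_distances(edges):
    """Return (sorted leaves, pairwise leaf distances) for an unrooted tree.

    edges: list of (node_u, node_v, branch_length) tuples.
    """
    adjacency = {}
    for u, v, length in edges:
        adjacency.setdefault(u, []).append((v, length))
        adjacency.setdefault(v, []).append((u, length))
    leaves = sorted(n for n, nbrs in adjacency.items() if len(nbrs) == 1)
    dist = {}
    for source in leaves:
        stack = [(source, None, 0.0)]  # depth-first walk from each leaf
        while stack:
            node, parent, d = stack.pop()
            if node != source and len(adjacency[node]) == 1:
                dist[(source, node)] = d
            for nxt, length in adjacency[node]:
                if nxt != parent:
                    stack.append((nxt, node, d + length))
    return leaves, dist


def brute_force_k_shrink(edges, k):
    """Exhaustively find k leaves whose removal minimizes the diameter."""
    leaves, dist = leaf_distances(edges)
    best = None
    for removed in combinations(leaves, k):
        kept = [leaf for leaf in leaves if leaf not in removed]
        diameter = max((dist[pair] for pair in combinations(kept, 2)),
                       default=0.0)
        if best is None or diameter < best[1]:
            best = (removed, diameter)
    return best


# Toy tree ((A,B),(C,D)) in which leaf D sits on a suspiciously long branch.
toy_edges = [("A", "X", 1), ("B", "X", 1), ("X", "Y", 1),
             ("C", "Y", 1), ("D", "Y", 10)]
removed_leaves, new_diameter = brute_force_k_shrink(toy_edges, 1)
# removed_leaves == ("D",), new_diameter == 3.0
```

Exhaustive search grows combinatorially with the number of leaves, so it is only useful for checking intuition on small trees. Removing leaves does not change the path lengths between the remaining leaves, which is why the precomputed distance table can be reused for every candidate set.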
Subjects
Algorithms , Computational Biology/methods , Mammals/classification , Mammals/genetics , Phylogeny , Software , Animals , Datasets as Topic , Genes , Humans , Models, Genetic , Plants/classification , Plants/genetics , Species Specificity
ABSTRACT
Many cases of rapid evolutionary radiations in plant and animal lineages are known; however phylogenetic relationships among these lineages have been difficult to resolve by systematists. Increasing amounts of genomic data have been sequentially applied in an attempt to resolve these radiations, dissecting their evolutionary patterns into a series of bifurcating events. Here we explore one such rapid radiation in the tropical plant order Zingiberales (the bananas and relatives) which includes eight families, approximately 110 genera, and more than 2600 species. One clade, the "Ginger families", including (Costaceae + Zingiberaceae) (Marantaceae + Cannaceae), has been well-resolved and well-supported in all previous studies. However, well-supported reconstructions among the "Banana families" (Musaceae, Heliconiaceae, Lowiaceae, Strelitziaceae), which most likely diverged about 90 Mya, have been difficult to confirm. Supported with anatomical, morphological, single locus, and genome-wide data, nearly every possible phylogenetic placement has been proposed for these families. In an attempt to resolve this complex evolutionary event, hybridization-based target enrichment was used to obtain sequences from up to 378 putatively orthologous low-copy nuclear genes (all ≥ 960 bp). Individual gene trees recovered multiple topologies among the early divergent lineages, with varying levels of support for these relationships. One topology of the "Banana families" (Musaceae (Heliconiaceae (Lowiaceae + Strelitziaceae))), which has not been suggested until now, was almost consistently recovered in all multilocus analyses of the nuclear dataset (concatenated - ExaML, coalescent - ASTRAL and ASTRID, supertree - MRL, and Bayesian concordance - BUCKy). Nevertheless, the multiple topologies recovered among these lineages suggest that even large amounts of genomic data might not be able to fully resolve relationships at this phylogenetic depth.
This lack of well-supported resolution could suggest methodological problems (i.e., violation of model assumptions in both concatenated and coalescent analyses) or more likely reflect an evolutionary history shaped by an explosive, rapid, and nearly simultaneous polychotomous radiation in this group of plants towards the end of the Cretaceous, perhaps driven by vertebrate pollinator selection.
Subjects
Genomics , Phylogeny , Tropical Climate , Zingiberales/classification , Zingiberales/genetics , Bayes Theorem , Cell Nucleus/genetics , Databases, Genetic , High-Throughput Nucleotide Sequencing , Open Reading Frames/genetics
ABSTRACT
Phylogenetic relationships in species complexes and lineages derived from rapid diversifications are often challenging to resolve using morphology or standard DNA barcoding markers. The hyper-diverse genus Lepanthes from Neotropical cloud forest includes over 1200 species and many recent, explosive diversifications that have resulted in poorly supported nodes and morphological convergence across clades. Here, we assess the performance of 446 nuclear-plastid-mitochondrial markers derived from an anchored hybrid enrichment approach (AHE) coupled with coalescence- and species network-based inferences to resolve phylogenetic relationships and improve species recognition in the Lepanthes horrida species group. In addition to using orchid-specific probes to increase enrichment efficiency, we improved gene tree resolution by extending standard angiosperm targets into adjacent exons. We found high topological discordance among individual gene trees, suggesting that hybridization/polyploidy may have promoted speciation in the lineage via formation of new hybrid taxa. In addition, we identified ten loci with the highest phylogenetic informativeness values from these genomes. Most previous phylogenetic sampling in the Pleurothallidinae relies on two regions (ITS and matK), therefore, the evaluation of other markers such as those shown here may be useful in future phylogenetic studies in the orchid family. Coalescent-based species tree estimation methods resolved the phylogenetic relationships of the L. horrida species group. The resolution of the phylogenetic estimations was improved with the inclusion of extended anchor targets. This approach produced longer loci with higher discriminative power. These analyses also disclosed two undescribed species, L. amicitiae and L. genetoapophantica, formally described here, which are also supported by morphology. 
Our study demonstrates the utility of combined genomic evidence to disentangle phylogenetic relationships at very shallow levels of the tree of life, and in clades showing convergent trait evolution. With a fully resolved phylogeny, it is possible to disentangle traits evolving in parallel or convergently across these orchid lineages, such as flower color and size, from diagnostic traits such as the shape and orientation of the lobes of the petals and lip.
Subjects
Cell Nucleus/genetics , Hybridization, Genetic , Mitochondria/genetics , Orchidaceae/genetics , Plastids/genetics , Cluster Analysis , Databases, Genetic , Flowers/anatomy & histology , Genetic Loci , Genetic Markers , Likelihood Functions , Phylogeny , Species Specificity
ABSTRACT
Phenotypic convergence is an exciting outcome of adaptive evolution, occurring when different species find similar solutions to the same problem. Unraveling the molecular basis of convergence provides a way to link genotype to adaptive phenotypes, but can also shed light on the extent to which molecular evolution is repeatable and predictable. Many recent genome-wide studies have uncovered a striking pattern of diminishing convergence over time, ascribing this pattern to the presence of intramolecular epistatic interactions. Here, we consider gene tree discordance as an alternative cause of changes in convergence levels over time in a primate dataset. We demonstrate that gene tree discordance can produce patterns of diminishing convergence by itself, and that controlling for discordance as a cause of apparent convergence makes the pattern disappear. We also show that synonymous substitutions, where neither selection nor epistasis should be prevalent, have the same diminishing pattern of molecular convergence in primates. Finally, we demonstrate that even in situations where biological discordance is not possible, discordance due to errors in species tree inference can drive similar patterns. Though intramolecular epistasis could in principle create a pattern of declining convergence over time, our results suggest a possible alternative explanation for this widespread pattern. These results contribute to a growing appreciation not just of the presence of gene tree discordance, but of the unpredictable effects this discordance can have on analyses of molecular evolution.
Subjects
Evolution, Molecular , Genetic Association Studies/methods , Genetic Variation , Animals , Biological Evolution , Epistasis, Genetic , Genetic Speciation , Genome , Genotype , Models, Genetic , Phylogeny , Primates/genetics
ABSTRACT
The horned toad assemblage, genus Megophrys sensu lato, currently includes three groups previously recognized as the genera Atympanophrys, Xenophrys and Megophrys sensu stricto. The taxonomic status and species composition of the three groups remain controversial due to conflicting phenotypic analyses and insufficient phylogenetic reconstruction; likewise, the position of the monotypic Borneophrys remains uncertain with respect to the horned toads. Further, the diversity of the horned toads remains poorly understood, especially for widespread species. Herein, we evaluate species-level diversity based on 45 of the 57 described species from throughout southern China, Southeast Asia and the Himalayas using Bayesian inference trees and the Generalized Mixed Yule Coalescent (GMYC) approach. We estimate the phylogeny using both mitochondrial and nuclear DNA data. Analyses reveal statistically significant mito-nuclear discordance. All analyses resolve paraphyly for horned toads involving multiple strongly supported clades. These clades correspond with geography. We resurrect the genera Atympanophrys and Xenophrys from the synonymy of Megophrys to eliminate paraphyly of Megophrys s.l. and to account for the morphological, molecular and biogeographic differences among these groups, but we also provide an alternative option. Our study suggests that Borneophrys is a junior synonym of Megophrys sensu stricto. We provide an estimated timeframe for the diversification of the horned toads. The mitochondrial and nuclear trees indicate the presence of many putative undescribed species. Widespread species, such as Xenophrys major and X. minor, likely have dramatically underestimated diversity. The integration of morphological and molecular evidence can validate this discovery. Montane forest dynamics appear to play a significant role in driving diversification of horned toads.
Subjects
Anura/classification, Animals, Anura/genetics, Bayes Theorem, Bufonidae/classification, Bufonidae/genetics, China, DNA/chemistry, DNA/isolation & purification, DNA/metabolism, Mitochondrial DNA/classification, Mitochondrial DNA/genetics, Phylogeny, Phylogeography, 16S Ribosomal RNA/chemistry, 16S Ribosomal RNA/genetics, DNA Sequence Analysis
ABSTRACT
Substitution rates are known to be variable among genes, chromosomes, species, and lineages due to multifarious biological processes. Here, we consider another source of substitution rate variation due to a technical bias associated with gene tree discordance. Discordance has been found to be rampant in genome-wide data sets, often due to incomplete lineage sorting (ILS). This apparent substitution rate variation is caused when substitutions that occur on discordant gene trees are analyzed in the context of a single, fixed species tree. Such substitutions have to be resolved by proposing multiple substitutions on the species tree, and we therefore refer to this phenomenon as Substitutions Produced by ILS (SPILS). We use simulations to demonstrate that SPILS has a larger effect with increasing levels of ILS, and on trees with larger numbers of taxa. Specific branches of the species trees are consistently, but erroneously, inferred to be longer or shorter, and we show that these branches can be predicted based on discordant tree topologies. Moreover, we observe that fixing a species tree topology when performing tests of positive selection increases the false positive rate, particularly for genes whose discordant topologies are most affected by SPILS. Finally, we use data from multiple Drosophila species to show that SPILS can be detected in nature. Although the effects of SPILS are modest per gene, it has the potential to affect substitution rate variation whenever high levels of ILS are present, particularly in rapid radiations. The problems outlined here have implications for character mapping of any type of trait, and for any biological process that causes discordance. We discuss possible solutions to these problems, and areas in which they are likely to have caused faulty inferences of convergence and accelerated evolution.
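The core mechanism described above can be illustrated with a small parsimony count. The following is a minimal sketch, not the authors' simulation pipeline: the four taxa (A to D), both topologies, and the binary site pattern are hypothetical, and Fitch parsimony is used here only as a simple stand-in for how substitutions get mapped onto a fixed tree. A single substitution on the branch uniting B and C in a discordant gene tree needs two changes when the same site is forced onto a species tree that groups A with B.

```python
def fitch_score(tree, states):
    """Fitch parsimony on a binary tree written as nested tuples.

    Returns (possible_states_at_root, minimum_number_of_changes).
    Leaves are taxon names (str); `states` maps taxon -> character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_score(left, states)
    right_set, right_changes = fitch_score(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # Children disagree: one additional change is required on this subtree.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical topologies (assumptions for illustration only).
species_tree = ((("A", "B"), "C"), "D")   # species tree groups A with B
gene_tree = ((("B", "C"), "A"), "D")      # discordant gene tree groups B with C

# One real substitution (0 -> 1) on the gene-tree branch ancestral to B and C.
site = {"A": 0, "B": 1, "C": 1, "D": 0}

gene_changes = fitch_score(gene_tree, site)[1]
species_changes = fitch_score(species_tree, site)[1]

print("changes on gene tree:   ", gene_changes)
print("changes on species tree:", species_changes)
```

The extra change inferred on the species tree is the kind of spurious substitution the abstract attributes to SPILS; in a real analysis the effect would be assessed with likelihood-based branch-length estimates across many loci rather than a single parsimony count.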
Subjects
Molecular Evolution, Genome, Genetic Models, Amino Acid Substitution, Animals, Drosophila/genetics, Phylogeny
ABSTRACT
Rickettsia is a genus of intracellular bacteria whose hosts and transmission strategies are both impressively diverse, and this is reflected in a highly dynamic genome. Some previous studies have described the evolutionary history of Rickettsia as non-tree-like, due to incongruity between phylogenetic reconstructions using different portions of the genome. Here, we reconstruct the Rickettsia phylogeny using whole-genome data, including two new genomes from previously unsampled host groups. We find that a single topology, which is supported by multiple sources of phylogenetic signal, well describes the evolutionary history of the core genome. We do observe extensive incongruence between individual gene trees, but analyses of simulations over a single topology and interspersed partitions of sites show that this is more plausibly attributed to systematic error than to horizontal gene transfer. Some conflicting placements also result from phylogenetic analyses of accessory genome content (i.e., gene presence/absence), but we argue that these are also due to systematic error, stemming from convergent genome reduction, which cannot be accommodated by existing phylogenetic methods. Our results show that, even within a single genus, tests for gene exchange based on phylogenetic incongruence may be susceptible to false positives.