Results 1 - 20 of 25
1.
Cancer Discov ; 14(10): 1810-1822, 2024 Oct 04.
Article in English | MEDLINE | ID: mdl-38943574

ABSTRACT

Tumors frequently display high chromosomal instability and contain multiple copies of genomic regions. Here, we describe Gain Route Identification and Timing In Cancer (GRITIC), a generic method for timing genomic gains leading to complex copy number states, using single-sample bulk whole-genome sequencing data. By applying GRITIC to 6,091 tumors, we found that non-parsimonious evolution is frequent in the formation of complex copy number states in genome-doubled tumors. We measured chromosomal instability before and after genome duplication in human tumors and found that late genome doubling was followed by an increase in the rate of copy number gain. Copy number gains often accumulate as punctuated bursts, commonly after genome doubling. We infer that genome duplications typically affect the landscape of copy number losses, while only minimally impacting copy number gains. In summary, GRITIC is a novel copy number gain timing framework that permits the analysis of copy number evolution in chromosomally unstable tumors. Significance: Complex genomic gains are associated with whole-genome duplications, which are frequent across tumors, span a large fraction of their genomes, and are linked to poorer outcomes. GRITIC infers when these gains occur during tumor development, which will help to identify the genetic events that drive tumor evolution. See related commentary by Taylor, p. 1766.


Subject(s)
Chromosomal Instability , DNA Copy Number Variations , Neoplasms , Humans , Neoplasms/genetics , Genome, Human
2.
Nature ; 629(8013): 851-860, 2024 May.
Article in English | MEDLINE | ID: mdl-38560995

ABSTRACT

Despite tremendous efforts in the past decades, relationships among main avian lineages remain heavily debated without a clear resolution. Discrepancies have been attributed to diversity of species sampled, phylogenetic method and the choice of genomic regions [1-3]. Here we address these issues by analysing the genomes of 363 bird species [4] (218 taxonomic families, 92% of total). Using intergenic regions and coalescent methods, we present a well-supported tree but also a marked degree of discordance. The tree confirms that Neoaves experienced rapid radiation at or near the Cretaceous-Palaeogene boundary. Sufficient loci rather than extensive taxon sampling were more effective in resolving difficult nodes. Remaining recalcitrant nodes involve species that are a challenge to model due to either extreme DNA composition, variable substitution rates, incomplete lineage sorting or complex evolutionary events such as ancient hybridization. Assessment of the effects of different genomic partitions showed high heterogeneity across the genome. We discovered sharp increases in effective population size, substitution rates and relative brain size following the Cretaceous-Palaeogene extinction event, supporting the hypothesis that emerging ecological opportunities catalysed the diversification of modern birds. The resulting phylogenetic estimate offers fresh insights into the rapid radiation of modern birds and provides a taxon-rich backbone tree for future comparative studies.


Subject(s)
Birds , Evolution, Molecular , Genome , Phylogeny , Animals , Birds/genetics , Birds/classification , Birds/anatomy & histology , Brain/anatomy & histology , Extinction, Biological , Genome/genetics , Genomics , Population Density , Male , Female
3.
Nat Commun ; 14(1): 8262, 2023 Dec 13.
Article in English | MEDLINE | ID: mdl-38092737

ABSTRACT

Cancers develop and progress as mutations accumulate, and with the advent of single-cell DNA and RNA sequencing, researchers can observe these mutations and their transcriptomic effects and predict proteomic changes with remarkable temporal and spatial precision. However, to connect genomic mutations with their transcriptomic and proteomic consequences, cells with either only DNA data or only RNA data must be mapped to a common domain. For this purpose, we present MaCroDNA, a method that uses maximum weighted bipartite matching of per-gene read counts from single-cell DNA and RNA-seq data. Using ground truth information from colorectal cancer data, we demonstrate the advantage of MaCroDNA over existing methods in accuracy and speed. Exemplifying the utility of single-cell data integration in cancer research, we suggest, based on results derived using MaCroDNA, that genomic mutations of large effect size increasingly contribute to differential expression between cells as Barrett's esophagus progresses to esophageal cancer, reaffirming the findings of the previous studies.
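
The matching step described above can be illustrated with a toy sketch. It assumes, purely for illustration, that the edge weight between a DNA cell and an RNA cell is the Pearson correlation of their per-gene read counts, and it finds the maximum-weight one-to-one assignment by brute force. The function names, the weight choice, and the count vectors are hypothetical and are not taken from MaCroDNA itself; a real analysis would need a polynomial-time assignment solver.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_matching(dna_cells, rna_cells):
    """Exhaustive maximum-weight bipartite matching (tiny inputs only).

    Each DNA cell is assigned to a distinct RNA cell so that the summed
    correlation is maximal. Requires len(dna_cells) <= len(rna_cells).
    """
    weights = [[pearson(d, r) for r in rna_cells] for d in dna_cells]
    best_score, best_assignment = float("-inf"), None
    for perm in permutations(range(len(rna_cells)), len(dna_cells)):
        score = sum(weights[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_assignment = score, perm
    return list(best_assignment), best_score


# Hypothetical per-gene read counts: 3 DNA cells and 3 RNA cells over 4 genes.
dna = [[2, 2, 4, 1], [1, 3, 2, 2], [4, 1, 1, 3]]
rna = [[10, 30, 21, 19], [41, 9, 12, 30], [19, 22, 40, 11]]
assignment, score = best_matching(dna, rna)
print(assignment)  # assignment[i] is the index of the RNA cell matched to DNA cell i
```

Enumerating permutations is only feasible for a handful of cells; it is used here because it makes the objective (maximize the total edge weight over one-to-one assignments) explicit.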


Subject(s)
Adenocarcinoma , Barrett Esophagus , Esophageal Neoplasms , Humans , Adenocarcinoma/genetics , RNA/genetics , Proteomics , Barrett Esophagus/genetics , Esophageal Neoplasms/pathology , DNA
4.
Genome Biol Evol ; 15(6)2023 06 01.
Article in English | MEDLINE | ID: mdl-37243541

ABSTRACT

The evolutionary histories of individual loci in a genome can be estimated independently, but this approach is error-prone due to the limited amount of sequence data available for each gene, which has led to the development of a diverse array of gene tree error correction methods which reduce the distance to the species tree. We investigate the performance of two representatives of these methods: TRACTION and TreeFix. We found that gene tree error correction frequently increases the level of error in gene tree topologies by "correcting" them to be closer to the species tree, even when the true gene and species trees are discordant. We confirm that full Bayesian inference of the gene trees under the multispecies coalescent model is more accurate than independent inference. Future gene tree correction approaches and methods should incorporate an adequately realistic model of evolution instead of relying on oversimplified heuristics.
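
The effect described above, where a "corrected" gene tree moves closer to the species tree but further from the true gene tree, can be made concrete with a tree distance. The sketch below assumes rooted trees written as nested tuples of leaf names and uses the rooted Robinson-Foulds distance (the number of clades present in one tree but not the other). The example trees are hypothetical, and the sketch does not reproduce TRACTION or TreeFix.

```python
def clades(tree):
    """Return the non-trivial clades (frozensets of leaf names) of a rooted tree.

    Trees are nested tuples of strings, e.g. ((("A", "B"), "C"), "D").
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by all trees on these taxa
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: size of the symmetric difference of clade sets."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical scenario with true gene tree / species tree discordance.
species_tree = ((("A", "B"), "C"), "D")
true_gene_tree = (("A", "C"), ("B", "D"))
estimated_gene_tree = ((("A", "C"), "B"), "D")
corrected_gene_tree = species_tree  # an extreme "correction" onto the species tree

print(rf_distance(estimated_gene_tree, true_gene_tree))  # error before correction
print(rf_distance(corrected_gene_tree, true_gene_tree))  # error after correction
```

In this constructed case the uncorrected estimate shares the clade {A, C} with the true gene tree, while the tree forced onto the species topology shares none, so the error rises from 2 to 4.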


Subject(s)
Genome , Models, Genetic , Phylogeny , Bayes Theorem
5.
Mol Phylogenet Evol ; 181: 107724, 2023 04.
Article in English | MEDLINE | ID: mdl-36720421

ABSTRACT

Accurate inference of population parameters plays a pivotal role in unravelling evolutionary histories. While recombination has been universally accepted as a fundamental process in the evolution of sexually reproducing organisms, it remains challenging to model it exactly. Thus, existing coalescent-based approaches make different assumptions or approximations to facilitate phylogenetic inference, which can potentially bring about biases in estimates of evolutionary parameters when recombination is present. In this article, we evaluate the performance of population parameter estimation using three methods (StarBEAST2, SNAPP, and diCal2) that represent three different types of inference. We performed whole-genome simulations in which recombination rates, mutation rates, and levels of incomplete lineage sorting were varied. We show that StarBEAST2 using short or medium-sized loci is robust to realistic rates of recombination, which is in agreement with previous studies. SNAPP, as expected, is generally unaffected by recombination events. Most surprisingly, diCal2, a method that is designed to explicitly account for recombination, performs considerably worse than other methods under comparison.


Subject(s)
Genome , Mutation Rate , Phylogeny , Recombination, Genetic , Models, Genetic , Computer Simulation
6.
Bioinformatics ; 38(Suppl 1): i195-i202, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758771

ABSTRACT

MOTIVATION: Single-nucleotide variants (SNVs) are the most common variations in the human genome. Recently developed methods for SNV detection from single-cell DNA sequencing data, such as SCIΦ and scVILP, leverage the evolutionary history of the cells to overcome the technical errors associated with single-cell sequencing protocols. Despite being accurate, these methods are not scalable to the extensive genomic breadth of single-cell whole-genome (scWGS) and whole-exome sequencing (scWES) data. RESULTS: Here, we report on a new scalable method, Phylovar, which extends the phylogeny-guided variant calling approach to sequencing datasets containing millions of loci. Through benchmarking on simulated datasets under different settings, we show that Phylovar outperforms SCIΦ in terms of running time while being more accurate than Monovar (which is not phylogeny-aware) in terms of SNV detection. Furthermore, we applied Phylovar to two real biological datasets: an scWES triple-negative breast cancer dataset consisting of 32 cells and 3375 loci, as well as an scWGS dataset of neuron cells from a normal human brain containing 16 cells and approximately 2.5 million loci. For the cancer data, Phylovar detected somatic SNVs with high or moderate functional impact that were also supported by a bulk sequencing dataset, and for the neuron dataset, Phylovar identified 5745 SNVs with non-synonymous effects, some of which were associated with neurodegenerative diseases. AVAILABILITY AND IMPLEMENTATION: Phylovar is implemented in Python and is publicly available at https://github.com/NakhlehLab/Phylovar.


Subject(s)
High-Throughput Nucleotide Sequencing , Nucleotides , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Humans , Phylogeny , Sequence Analysis, DNA
7.
PLoS Comput Biol ; 18(6): e1010216, 2022 06.
Article in English | MEDLINE | ID: mdl-35675326

ABSTRACT

Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences (MHGs) where each MHG is a maximal set of maximum-length sequences which are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and their overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers causes incorrect inferences as to the overall relationship between bacterial taxa.


Subject(s)
Genome, Bacterial , Prokaryotic Cells , Gene Transfer, Horizontal , Genome, Bacterial/genetics , Phylogeny , Sequence Alignment
8.
Syst Biol ; 71(3): 706-720, 2022 04 19.
Article in English | MEDLINE | ID: mdl-34605924

ABSTRACT

Phylogenetic networks provide a powerful framework for modeling and analyzing reticulate evolutionary histories. While polyploidy has been shown to be prevalent not only in plants but also in other groups of eukaryotic species, most work done thus far on phylogenetic network inference assumes diploid hybridization. These inference methods have been applied, with varying degrees of success, to data sets with polyploid species, even though polyploidy violates the mathematical assumptions underlying these methods. Statistical methods were developed recently for handling specific types of polyploids and so were parsimony methods that could handle polyploidy more generally yet while excluding processes such as incomplete lineage sorting. In this article, we introduce a new method for inferring most parsimonious phylogenetic networks on data that include polyploid species. Taking gene tree topologies as input, the method seeks a phylogenetic network that minimizes deep coalescences while accounting for polyploidy. We demonstrate the performance of the method on both simulated and biological data. The inference method as well as a method for evaluating evolutionary hypotheses in the form of phylogenetic networks are implemented and publicly available in the PhyloNet software package. [Incomplete lineage sorting; minimizing deep coalescences; multilabeled trees; multispecies network coalescent; phylogenetic networks; polyploidy.].
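
The "minimizing deep coalescences" criterion mentioned above can be illustrated in its simplest setting. The sketch below is restricted to a species tree (not a network), with no polyploidy and exactly one sampled allele per species; under those assumptions it counts the extra lineages a gene tree topology forces on the branches of the species tree. It is a simplified illustration only, not the method implemented in PhyloNet, and all names are hypothetical.

```python
def leaf_set(node):
    """Leaf names under a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def count_lineages(gene_node, species_cluster):
    """Number of maximal gene-tree clades contained entirely in species_cluster.

    This equals the number of gene lineages leaving the species-tree branch
    above that cluster when coalescences happen as early as the topology allows.
    """
    if leaf_set(gene_node) <= species_cluster:
        return 1
    if isinstance(gene_node, str):
        return 0
    return sum(count_lineages(child, species_cluster) for child in gene_node)


def extra_lineages(gene_tree, species_tree):
    """Total extra lineages over all non-root branches of the species tree.

    Assumes both trees have the same leaf set, one allele per species.
    """
    total = 0

    def visit(node, is_root):
        nonlocal total
        if not is_root:
            total += count_lineages(gene_tree, leaf_set(node)) - 1
        if not isinstance(node, str):
            for child in node:
                visit(child, False)

    visit(species_tree, True)
    return total


species_tree = ((("A", "B"), "C"), "D")
print(extra_lineages(((("A", "B"), "C"), "D"), species_tree))  # congruent gene tree
print(extra_lineages((("A", "C"), ("B", "D")), species_tree))  # discordant gene tree
```

A parsimony search of the kind described in the abstract would score candidate species histories by summing such counts over all input gene trees and prefer the candidate with the smallest total.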


Subject(s)
Hybridization, Genetic , Polyploidy , Biological Evolution , Humans , Phylogeny
9.
PLoS Genet ; 17(8): e1009701, 2021 08.
Article in English | MEDLINE | ID: mdl-34407067

ABSTRACT

Trait evolution among a set of species, a central theme in evolutionary biology, has long been understood and analyzed with respect to a species tree. However, the field of phylogenomics, which has been propelled by advances in sequencing technologies, has ushered in the era of species/gene tree incongruence and, consequently, a more nuanced understanding of trait evolution. For a trait whose states are incongruent with the branching patterns in the species tree, the same state could have arisen independently in different species (homoplasy) or followed the branching patterns of gene trees, incongruent with the species tree (hemiplasy). Another evolutionary process whose extent and significance are better revealed by phylogenomic studies is gene flow between different species. In this work, we present a phylogenomic method for assessing the role of hybridization and introgression in the evolution of polymorphic or monomorphic binary traits. We apply the method to simulated evolutionary scenarios to demonstrate the interplay between the parameters of the evolutionary history and the role of introgression in a binary trait's evolution (which we call xenoplasy). Very importantly, we demonstrate, including on a biological data set, that inferring a species tree and using it for trait evolution analysis in the presence of gene flow could lead to misleading hypotheses about trait evolution.


Subject(s)
Computational Biology/methods , Genetic Introgression/genetics , Quantitative Trait Loci , Evolution, Molecular , Genetic Speciation , Models, Genetic , Phenotype , Phylogeny
10.
Genome Res ; 31(11): 2107-2119, 2021 11.
Article in English | MEDLINE | ID: mdl-34426513

ABSTRACT

Coalescent methods are proven and powerful tools for population genetics, phylogenetics, epidemiology, and other fields. A promising avenue for the analysis of large genomic alignments, which are increasingly common, is coalescent hidden Markov model (coalHMM) methods, but these methods have lacked general usability and flexibility. We introduce a novel method for automatically learning a coalHMM and inferring the posterior distributions of evolutionary parameters using black-box variational inference, with the transition rates between local genealogies derived empirically by simulation. This derivation enables our method to work directly with three or four taxa and through a divide-and-conquer approach with more taxa. Using a simulated data set resembling a human-chimp-gorilla scenario, we show that our method has comparable or better accuracy to previous coalHMM methods. Both species divergence times and population sizes were accurately inferred. The method also infers local genealogies, and we report on their accuracy. Furthermore, we discuss a potential direction for scaling the method to larger data sets through a divide-and-conquer approach. This accuracy means our method is useful now, and by deriving transition rates by simulation, it is flexible enough to enable future implementations of various population models.


Subject(s)
Genetics, Population , Models, Genetic , Animals , Computer Simulation , Humans , Population Density , Recombination, Genetic
11.
Syst Biol ; 71(1): 208-220, 2021 12 16.
Article in English | MEDLINE | ID: mdl-34228807

ABSTRACT

Evolutionary models account for either population- or species-level processes but usually not both. We introduce a new model, the FBD-MSC, which makes it possible for the first time to integrate both the genealogical and fossilization phenomena, by means of the multispecies coalescent (MSC) and the fossilized birth-death (FBD) processes. Using this model, we reconstruct the phylogeny representing all extant and many fossil Caninae, recovering both the relative and absolute time of speciation events. We quantify known inaccuracy issues with divergence time estimates using the popular strategy of concatenating molecular alignments and show that the FBD-MSC solves them. Our new integrative method and empirical results advance the paradigm and practice of probabilistic total evidence analyses in evolutionary biology. [Caninae; fossilized birth-death; molecular clock; multispecies coalescent; phylogenetics; species trees.].


Subject(s)
Genetic Speciation , Models, Biological , Biological Evolution , Fossils , Phylogeny
12.
Genome Res ; 31(4): 635-644, 2021 04.
Article in English | MEDLINE | ID: mdl-33602693

ABSTRACT

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.


Subject(s)
COVID-19/diagnosis , COVID-19/transmission , Genetic Variation , Genome, Viral , Real-Time Polymerase Chain Reaction/methods , SARS-CoV-2/genetics , COVID-19/virology , Host-Pathogen Interactions , Humans , Polymorphism, Single Nucleotide
13.
Mol Biol Evol ; 37(6): 1809-1818, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32077947

ABSTRACT

Species tree inference from multilocus data has emerged as a powerful paradigm in the postgenomic era, both in terms of the accuracy of the species tree it produces as well as in terms of elucidating the processes that shaped the evolutionary history. Bayesian methods for species tree inference are desirable in this area as they have been shown not only to yield accurate estimates, but also to naturally provide measures of confidence in those estimates. However, the heavy computational requirements of Bayesian inference have limited the applicability of such methods to very small data sets. In this article, we show that the computational efficiency of Bayesian inference under the multispecies coalescent can be improved in practice by restricting the space of the gene trees explored during the random walk, without sacrificing accuracy as measured by various metrics. The idea is to first infer constraints on the trees of the individual loci in the form of unresolved gene trees, and then to restrict the sampler to consider only resolutions of the constrained trees. We demonstrate the improvements gained by such an approach on both simulated and biological data.
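
To give a sense of how much a constraint of this kind can shrink the search space, the sketch below counts the binary resolutions of an unresolved rooted tree. It relies on the standard result that a node with k children can be resolved into (2k-3)!! rooted binary subtrees, with polytomies resolved independently. The example constraint tree is hypothetical, and the sketch says nothing about how the sampler in the article is actually implemented.

```python
def double_factorial(n):
    """n!! = n * (n - 2) * (n - 4) * ..., with the empty product equal to 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def num_resolutions(tree):
    """Number of rooted binary trees that refine an unresolved rooted tree.

    Trees are nested tuples of leaf names; a node with k children can be
    resolved in (2k - 3)!! ways, independently of every other node.
    """
    if isinstance(tree, str):
        return 1
    count = double_factorial(2 * len(tree) - 3)
    for child in tree:
        count *= num_resolutions(child)
    return count


unconstrained = ("A", "B", "C", "D", "E", "F")   # star tree: any topology allowed
constrained = (("A", "B", "C"), ("D", "E", "F"))  # hypothetical inferred constraint

print(num_resolutions(unconstrained))  # all rooted binary trees on 6 leaves
print(num_resolutions(constrained))    # only resolutions of the constraint
```

For six taxa the star tree admits 945 rooted binary topologies, while the two-clade constraint leaves 9, which illustrates why restricting the sampler to resolutions of constrained trees can reduce the work per locus.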


Subject(s)
Models, Genetic , Phylogeny , Bayes Theorem , Markov Chains , Monte Carlo Method
14.
Planta ; 250(6): 1941-1953, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31529398

ABSTRACT

MAIN CONCLUSION: Unlike rosette leaves, the mature Arabidopsis rosette core can display full resistance to Botrytis cinerea revealing the importance for spatial and developmental aspects of plant fungal resistance. Arabidopsis thaliana is a model host to investigate plant defense against fungi. However, many of the reports investigating Arabidopsis fungal defense against the necrotrophic fungus, Botrytis cinerea, utilize rosette leaves as host tissue. Here we report organ-dependent differences in B. cinerea resistance of Arabidopsis. Although wild-type Arabidopsis rosette leaves mount a jasmonate-dependent defense that slows fungal growth, this defense is incapable of resisting fungal devastation. In contrast, as the fungus spreads through infected leaf petioles towards the plant center, or rosette core, there is a jasmonate- and age-dependent fungal penetration blockage into the rosette core. We report evidence for induced and preformed resistance in the rosette core, as direct rosette core inoculation can also result in resistance, but at a lower penetrance relative to infections that approach the core from infected leaf petioles. The Arabidopsis rosette core displays a distinct transcriptome relative to other plant organs, and BLADE ON PETIOLE (BOP) transcripts are abundant in the rosette core. The BOP genes, with known roles in abscission zone formation, are required for full Arabidopsis rosette core B. cinerea resistance, suggesting a possible role for BOP-dependent modifications that may help to restrict fungal susceptibility of the rosette core. Finally, we demonstrate that cabbage and cauliflower, common Brassicaceae crops, also display leaf susceptibility and rosette core resistance to B. cinerea that can involve leaf abscission. Thus, spatial and developmental aspects of plant host resistance play critical roles in resistance to necrotrophic fungal pathogens and are important to our understanding of plant defense mechanisms.


Subject(s)
Arabidopsis/immunology , Disease Resistance , Plant Diseases/microbiology , Plant Leaves/microbiology , Arabidopsis/microbiology , Arabidopsis/physiology , Botrytis , Gene Expression Profiling , Plant Diseases/immunology , Plant Growth Regulators/metabolism , Plant Leaves/immunology , Real-Time Polymerase Chain Reaction
15.
Bioinformatics ; 35(14): i370-i378, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510688

ABSTRACT

MOTIVATION: Reticulate evolutionary histories, such as those arising in the presence of hybridization, are best modeled as phylogenetic networks. Recently developed methods allow for statistical inference of phylogenetic networks while also accounting for other processes, such as incomplete lineage sorting. However, these methods can only handle a small number of loci from a handful of genomes. RESULTS: In this article, we introduce a novel two-step method for scalable inference of phylogenetic networks from the sequence alignments of multiple, unlinked loci. The method infers networks on subproblems and then merges them into a network on the full set of taxa. To reduce the number of trinets to infer, we formulate a Hitting Set version of the problem of finding a small number of subsets, and implement a simple heuristic to solve it. We studied their performance, in terms of both running time and accuracy, on simulated as well as on biological datasets. The two-step method accurately infers phylogenetic networks at a scale that is infeasible with existing methods. The results are a significant and promising step towards accurate, large-scale phylogenetic network inference. AVAILABILITY AND IMPLEMENTATION: We implemented the algorithms in the publicly available software package PhyloNet (https://bioinfocs.rice.edu/PhyloNet). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
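
The abstract does not say which heuristic is used for the Hitting Set formulation, so the following is only a sketch of the textbook greedy approach: repeatedly pick the element that intersects the largest number of not-yet-hit sets. The input sets and their interpretation are hypothetical.

```python
def greedy_hitting_set(subsets):
    """Greedy heuristic for Hitting Set.

    Returns a list of elements such that every non-empty input set contains
    at least one of them. The result is not guaranteed to be minimum.
    """
    remaining = [set(s) for s in subsets if s]
    chosen = []
    while remaining:
        # Count how many still-unhit sets each element would hit.
        counts = {}
        for s in remaining:
            for item in s:
                counts[item] = counts.get(item, 0) + 1
        # Iterating over sorted keys makes tie-breaking deterministic.
        best = max(sorted(counts), key=lambda item: counts[item])
        chosen.append(best)
        remaining = [s for s in remaining if best not in s]
    return chosen


# Hypothetical instance: each set must be "hit" by at least one chosen element.
sets_to_hit = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"B", "D"}]
hitters = greedy_hitting_set(sets_to_hit)
print(hitters)
```

Hitting Set is NP-hard in general, which is why a simple heuristic is a reasonable choice when only a small, not necessarily minimum, selection is needed.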


Subject(s)
Algorithms , Phylogeny , Evolution, Molecular , Genome , Sequence Alignment , Software
16.
J Exp Bot ; 70(15): 3955-3967, 2019 08 07.
Article in English | MEDLINE | ID: mdl-31056646

ABSTRACT

Lateral root (LR) proliferation is a major determinant of soil nutrient uptake. How resource allocation controls the extent of LR growth remains unresolved. We used genetic, physiological, transcriptomic, and grafting approaches to define a role for C-TERMINALLY ENCODED PEPTIDE RECEPTOR 1 (CEPR1) in controlling sucrose-dependent LR growth. CEPR1 inhibited LR growth in response to applied sucrose, other metabolizable sugars, and elevated light intensity. Pathways through CEPR1 restricted LR growth by reducing LR meristem size and the length of mature LR cells. RNA-sequencing of wild-type (WT) and cepr1-1 roots with or without sucrose treatment revealed an intersection of CEP-CEPR1 signalling with the sucrose transcriptional response. Sucrose up-regulated several CEP genes, supporting a specific role for CEP-CEPR1 in the response to sucrose. Moreover, genes with basally perturbed expression in cepr1-1 overlap with WT sucrose-responsive genes significantly. We found that exogenous CEP inhibited LR growth via CEPR1 by reducing LR meristem size and mature cell length. This result is consistent with CEP-CEPR1 acting to curtail the extent of sucrose-dependent LR growth. Reciprocal grafting indicates that LR growth inhibition requires CEPR1 in both the roots and shoots. Our results reveal a new role for CEP-CEPR1 signalling in controlling LR growth in response to sucrose.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/growth & development , Arabidopsis/metabolism , Plant Roots/growth & development , Plant Roots/metabolism , Receptors, Peptide/metabolism , Sucrose/metabolism , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Gene Expression Regulation, Plant/genetics , Gene Expression Regulation, Plant/physiology , Meristem/genetics , Meristem/growth & development , Meristem/metabolism , Plant Roots/genetics , Receptors, Peptide/genetics
17.
PLoS Comput Biol ; 15(4): e1006650, 2019 04.
Article in English | MEDLINE | ID: mdl-30958812

ABSTRACT

Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.


Subject(s)
Bayes Theorem , Biological Evolution , Phylogeny , Software , Animals , Computational Biology , Computer Simulation , Evolution, Molecular , Humans , Markov Chains , Models, Genetic , Monte Carlo Method
18.
Mol Biol Evol ; 35(2): 504-517, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29220490

ABSTRACT

Reticulate species evolution, such as hybridization or introgression, is relatively common in nature. In the presence of reticulation, species relationships can be captured by a rooted phylogenetic network, and orthologous gene evolution can be modeled as bifurcating gene trees embedded in the species network. We present a Bayesian approach to jointly infer species networks and gene trees from multilocus sequence data. A novel birth-hybridization process is used as the prior for the species network, and we assume a multispecies network coalescent prior for the embedded gene trees. We verify the ability of our method to correctly sample from the posterior distribution, and thus to infer a species network, through simulations. To quantify the power of our method, we reanalyze two large data sets of genes from spruces and yeasts. For the three closely related spruces, we verify the previously suggested homoploid hybridization event in this clade; for the yeast data, we find extensive hybridization events. Our method is available within the BEAST 2 add-on SpeciesNetwork, and thus provides an extensible framework for Bayesian inference of reticulate evolution.


Subject(s)
Genetic Techniques , Hybridization, Genetic , Models, Genetic , Phylogeny , Bayes Theorem , Computer Simulation , Molecular Sequence Data , Picea/genetics , Saccharomyces/genetics
19.
PeerJ ; 5: e3724, 2017.
Article in English | MEDLINE | ID: mdl-28875076

ABSTRACT

While methods for genetic species delimitation have noticeably improved in the last decade, this remains a work in progress. Ideally, model based approaches should be applied and considered jointly with other lines of evidence, primarily morphology and geography, in an integrative taxonomy framework. Deep phylogeographic divergences have been reported for several species of Carlia skinks, but only for some eastern taxa have species boundaries been formally tested. The present study does this and revises the taxonomy for two species from northern Australia, Carlia johnstonei and C. triacantha. We introduce an approach that is based on the recently published method StarBEAST2, which uses multilocus data to explore the support for alternative species delimitation hypotheses using Bayes Factors (BFD). We apply this method, jointly with two other multispecies coalescent methods, using an extensive (from 2,163 exons) data set along with measures of 11 morphological characters. We use this integrated approach to evaluate two new candidate species previously revealed in phylogeographic analyses of rainbow skinks (genus Carlia) in Western Australia. The results based on BFD StarBEAST2, BFD* SNAPP and BPP genetic delimitation, together with morphology, support each of the four recently identified Carlia lineages as separate species. The BFD StarBEAST2 approach yielded results highly congruent with those from BFD* SNAPP and BPP. This supports use of the robust multilocus multispecies coalescent StarBEAST2 method for species delimitation, which does not require a priori resolved species or gene trees. Compared to the situation in C. triacantha, morphological divergence was greater between the two lineages within Kimberley endemic C. johnstonei, which also had deeper divergent histories. This congruence supports recognition of two species within C. johnstonei. Nevertheless, the combined evidence also supports recognition of two taxa within the more widespread C. triacantha. 
With this work, we describe two new species, Carlia insularis sp. nov. and Carlia isostriacantha sp. nov., in the northwest of Australia. This contributes to increasing recognition that this region of tropical Australia has a rich and unique fauna.

20.
Mol Biol Evol ; 34(8): 2101-2114, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28431121

ABSTRACT

Fully Bayesian multispecies coalescent (MSC) methods like *BEAST estimate species trees from multiple sequence alignments. Today thousands of genes can be sequenced for a given study, but using that many genes with *BEAST is intractably slow. An alternative is to use heuristic methods which compromise accuracy or completeness in return for speed. A common heuristic is concatenation, which assumes that the evolutionary history of each gene tree is identical to the species tree. This is an inconsistent estimator of species tree topology, a worse estimator of divergence times, and induces spurious substitution rate variation when incomplete lineage sorting is present. Another class of heuristics directly motivated by the MSC avoids many of the pitfalls of concatenation but cannot be used to estimate divergence times. To enable fuller use of available data and more accurate inference of species tree topologies, divergence times, and substitution rates, we have developed a new version of *BEAST called StarBEAST2. To improve convergence rates we add analytical integration of population sizes, novel MCMC operators and other optimizations. Computational performance improved by 13.5× and 13.8× respectively when analyzing two empirical data sets, and an average of 33.1× across 30 simulated data sets. To enable accurate estimates of per-species substitution rates, we introduce species tree relaxed clocks, and show that StarBEAST2 is a more powerful and robust estimator of rate variation than concatenation. StarBEAST2 is available through the BEAUTi package manager in BEAST 2.4 and above.


Subject(s)
Sequence Alignment/methods , Base Sequence , Bayes Theorem , Biological Evolution , Computer Simulation , Genetic Speciation , Models, Genetic , Mutation Rate , Phylogeny , Software