1.
Nature ; 629(8013): 851-860, 2024 May.
Article in English | MEDLINE | ID: mdl-38560995

ABSTRACT

Despite tremendous efforts in the past decades, relationships among main avian lineages remain heavily debated without a clear resolution. Discrepancies have been attributed to diversity of species sampled, phylogenetic method and the choice of genomic regions (refs. 1-3). Here we address these issues by analysing the genomes of 363 bird species (ref. 4) (218 taxonomic families, 92% of total). Using intergenic regions and coalescent methods, we present a well-supported tree but also a marked degree of discordance. The tree confirms that Neoaves experienced rapid radiation at or near the Cretaceous-Palaeogene boundary. Sufficient loci rather than extensive taxon sampling were more effective in resolving difficult nodes. Remaining recalcitrant nodes involve species that are a challenge to model due to either extreme DNA composition, variable substitution rates, incomplete lineage sorting or complex evolutionary events such as ancient hybridization. Assessment of the effects of different genomic partitions showed high heterogeneity across the genome. We discovered sharp increases in effective population size, substitution rates and relative brain size following the Cretaceous-Palaeogene extinction event, supporting the hypothesis that emerging ecological opportunities catalysed the diversification of modern birds. The resulting phylogenetic estimate offers fresh insights into the rapid radiation of modern birds and provides a taxon-rich backbone tree for future comparative studies.


Subject(s)
Birds , Evolution, Molecular , Genome , Phylogeny , Animals , Birds/genetics , Birds/classification , Birds/anatomy & histology , Brain/anatomy & histology , Extinction, Biological , Genome/genetics , Genomics , Population Density , Male , Female
2.
Genome Res ; 31(11): 2107-2119, 2021 11.
Article in English | MEDLINE | ID: mdl-34426513

ABSTRACT

Coalescent methods are proven and powerful tools for population genetics, phylogenetics, epidemiology, and other fields. A promising avenue for the analysis of large genomic alignments, which are increasingly common, is coalescent hidden Markov model (coalHMM) methods, but these methods have lacked general usability and flexibility. We introduce a novel method for automatically learning a coalHMM and inferring the posterior distributions of evolutionary parameters using black-box variational inference, with the transition rates between local genealogies derived empirically by simulation. This derivation enables our method to work directly with three or four taxa and through a divide-and-conquer approach with more taxa. Using a simulated data set resembling a human-chimp-gorilla scenario, we show that our method has comparable or better accuracy to previous coalHMM methods. Both species divergence times and population sizes were accurately inferred. The method also infers local genealogies, and we report on their accuracy. Furthermore, we discuss a potential direction for scaling the method to larger data sets through a divide-and-conquer approach. This accuracy means our method is useful now, and by deriving transition rates by simulation, it is flexible enough to enable future implementations of various population models.


Subject(s)
Genetics, Population , Models, Genetic , Animals , Computer Simulation , Humans , Population Density , Recombination, Genetic
3.
Genome Res ; 31(4): 635-644, 2021 04.
Article in English | MEDLINE | ID: mdl-33602693

ABSTRACT

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.


Subject(s)
COVID-19/diagnosis , COVID-19/transmission , Genetic Variation , Genome, Viral , Real-Time Polymerase Chain Reaction/methods , SARS-CoV-2/genetics , COVID-19/virology , Host-Pathogen Interactions , Humans , Polymorphism, Single Nucleotide
4.
PLoS Genet ; 17(8): e1009701, 2021 08.
Article in English | MEDLINE | ID: mdl-34407067

ABSTRACT

Trait evolution among a set of species, a central theme in evolutionary biology, has long been understood and analyzed with respect to a species tree. However, the field of phylogenomics, which has been propelled by advances in sequencing technologies, has ushered in the era of species/gene tree incongruence and, consequently, a more nuanced understanding of trait evolution. For a trait whose states are incongruent with the branching patterns in the species tree, the same state could have arisen independently in different species (homoplasy) or followed the branching patterns of gene trees, incongruent with the species tree (hemiplasy). Another evolutionary process whose extent and significance are better revealed by phylogenomic studies is gene flow between different species. In this work, we present a phylogenomic method for assessing the role of hybridization and introgression in the evolution of polymorphic or monomorphic binary traits. We apply the method to simulated evolutionary scenarios to demonstrate the interplay between the parameters of the evolutionary history and the role of introgression in a binary trait's evolution (which we call xenoplasy). Importantly, we demonstrate, including on a biological data set, that inferring a species tree and using it for trait evolution analysis in the presence of gene flow could lead to misleading hypotheses about trait evolution.


Subject(s)
Computational Biology/methods , Genetic Introgression/genetics , Quantitative Trait Loci , Evolution, Molecular , Genetic Speciation , Models, Genetic , Phenotype , Phylogeny
5.
Bioinformatics ; 38(Suppl 1): i195-i202, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758771

ABSTRACT

MOTIVATION: Single-nucleotide variants (SNVs) are the most common variations in the human genome. Recently developed methods for SNV detection from single-cell DNA sequencing data, such as SCIΦ and scVILP, leverage the evolutionary history of the cells to overcome the technical errors associated with single-cell sequencing protocols. Despite being accurate, these methods are not scalable to the extensive genomic breadth of single-cell whole-genome (scWGS) and whole-exome sequencing (scWES) data. RESULTS: Here, we report on a new scalable method, Phylovar, which extends the phylogeny-guided variant calling approach to sequencing datasets containing millions of loci. Through benchmarking on simulated datasets under different settings, we show that Phylovar outperforms SCIΦ in terms of running time while being more accurate than Monovar (which is not phylogeny-aware) in terms of SNV detection. Furthermore, we applied Phylovar to two real biological datasets: an scWES triple-negative breast cancer dataset consisting of 32 cells and 3375 loci, as well as an scWGS dataset of neuron cells from a normal human brain containing 16 cells and approximately 2.5 million loci. For the cancer data, Phylovar detected somatic SNVs with high or moderate functional impact that were also supported by a bulk sequencing dataset, and for the neuron dataset, Phylovar identified 5745 SNVs with non-synonymous effects, some of which were associated with neurodegenerative diseases. AVAILABILITY AND IMPLEMENTATION: Phylovar is implemented in Python and is publicly available at https://github.com/NakhlehLab/Phylovar.


Subject(s)
High-Throughput Nucleotide Sequencing , Nucleotides , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Humans , Phylogeny , Sequence Analysis, DNA
6.
Mol Phylogenet Evol ; 181: 107724, 2023 04.
Article in English | MEDLINE | ID: mdl-36720421

ABSTRACT

Accurate inference of population parameters plays a pivotal role in unravelling evolutionary histories. While recombination has been universally accepted as a fundamental process in the evolution of sexually reproducing organisms, it remains challenging to model it exactly. Thus, existing coalescent-based approaches make different assumptions or approximations to facilitate phylogenetic inference, which can potentially bring about biases in estimates of evolutionary parameters when recombination is present. In this article, we evaluate the performance of population parameter estimation using three methods (StarBEAST2, SNAPP, and diCal2) that represent three different types of inference. We performed whole-genome simulations in which recombination rates, mutation rates, and levels of incomplete lineage sorting were varied. We show that StarBEAST2 using short or medium-sized loci is robust to realistic rates of recombination, which is in agreement with previous studies. SNAPP, as expected, is generally unaffected by recombination events. Most surprisingly, diCal2, a method that is designed to explicitly account for recombination, performs considerably worse than other methods under comparison.


Subject(s)
Genome , Mutation Rate , Phylogeny , Recombination, Genetic , Models, Genetic , Computer Simulation
7.
Syst Biol ; 71(3): 706-720, 2022 04 19.
Article in English | MEDLINE | ID: mdl-34605924

ABSTRACT

Phylogenetic networks provide a powerful framework for modeling and analyzing reticulate evolutionary histories. While polyploidy has been shown to be prevalent not only in plants but also in other groups of eukaryotic species, most work done thus far on phylogenetic network inference assumes diploid hybridization. These inference methods have been applied, with varying degrees of success, to data sets with polyploid species, even though polyploidy violates the mathematical assumptions underlying these methods. Statistical methods were developed recently for handling specific types of polyploids, as were parsimony methods that handle polyploidy more generally while excluding processes such as incomplete lineage sorting. In this article, we introduce a new method for inferring most parsimonious phylogenetic networks on data that include polyploid species. Taking gene tree topologies as input, the method seeks a phylogenetic network that minimizes deep coalescences while accounting for polyploidy. We demonstrate the performance of the method on both simulated and biological data. The inference method as well as a method for evaluating evolutionary hypotheses in the form of phylogenetic networks are implemented and publicly available in the PhyloNet software package. [Incomplete lineage sorting; minimizing deep coalescences; multilabeled trees; multispecies network coalescent; phylogenetic networks; polyploidy.].


Subject(s)
Hybridization, Genetic , Polyploidy , Biological Evolution , Humans , Phylogeny
8.
PLoS Comput Biol ; 18(6): e1010216, 2022 06.
Article in English | MEDLINE | ID: mdl-35675326

ABSTRACT

Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences (MHGs) where each MHG is a maximal set of maximum-length sequences which are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and their overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers causes incorrect inferences as to the overall relationship between bacterial taxa.


Subject(s)
Genome, Bacterial , Prokaryotic Cells , Gene Transfer, Horizontal , Genome, Bacterial/genetics , Phylogeny , Sequence Alignment
9.
Syst Biol ; 71(1): 208-220, 2021 12 16.
Article in English | MEDLINE | ID: mdl-34228807

ABSTRACT

Evolutionary models account for either population- or species-level processes but usually not both. We introduce a new model, the FBD-MSC, which makes it possible for the first time to integrate both the genealogical and fossilization phenomena, by means of the multispecies coalescent (MSC) and the fossilized birth-death (FBD) processes. Using this model, we reconstruct the phylogeny representing all extant and many fossil Caninae, recovering both the relative and absolute time of speciation events. We quantify known inaccuracy issues with divergence time estimates using the popular strategy of concatenating molecular alignments and show that the FBD-MSC solves them. Our new integrative method and empirical results advance the paradigm and practice of probabilistic total evidence analyses in evolutionary biology. [Caninae; fossilized birth-death; molecular clock; multispecies coalescent; phylogenetics; species trees.].


Subject(s)
Genetic Speciation , Models, Biological , Biological Evolution , Fossils , Phylogeny
10.
Mol Biol Evol ; 37(6): 1809-1818, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32077947

ABSTRACT

Species tree inference from multilocus data has emerged as a powerful paradigm in the postgenomic era, both in terms of the accuracy of the species tree it produces as well as in terms of elucidating the processes that shaped the evolutionary history. Bayesian methods for species tree inference are desirable in this area as they have been shown not only to yield accurate estimates, but also to naturally provide measures of confidence in those estimates. However, the heavy computational requirements of Bayesian inference have limited the applicability of such methods to very small data sets. In this article, we show that the computational efficiency of Bayesian inference under the multispecies coalescent can be improved in practice by restricting the space of the gene trees explored during the random walk, without sacrificing accuracy as measured by various metrics. The idea is to first infer constraints on the trees of the individual loci in the form of unresolved gene trees, and then to restrict the sampler to consider only resolutions of the constrained trees. We demonstrate the improvements gained by such an approach on both simulated and biological data.


Subject(s)
Models, Genetic , Phylogeny , Bayes Theorem , Markov Chains , Monte Carlo Method
11.
Bioinformatics ; 35(14): i370-i378, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510688

ABSTRACT

MOTIVATION: Reticulate evolutionary histories, such as those arising in the presence of hybridization, are best modeled as phylogenetic networks. Recently developed methods allow for statistical inference of phylogenetic networks while also accounting for other processes, such as incomplete lineage sorting. However, these methods can only handle a small number of loci from a handful of genomes. RESULTS: In this article, we introduce a novel two-step method for scalable inference of phylogenetic networks from the sequence alignments of multiple, unlinked loci. The method infers networks on subproblems and then merges them into a network on the full set of taxa. To reduce the number of trinets to infer, we formulate a Hitting Set version of the problem of finding a small number of subsets, and implement a simple heuristic to solve it. We studied their performance, in terms of both running time and accuracy, on simulated as well as on biological datasets. The two-step method accurately infers phylogenetic networks at a scale that is infeasible with existing methods. The results are a significant and promising step towards accurate, large-scale phylogenetic network inference. AVAILABILITY AND IMPLEMENTATION: We implemented the algorithms in the publicly available software package PhyloNet (https://bioinfocs.rice.edu/PhyloNet). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Phylogeny , Evolution, Molecular , Genome , Sequence Alignment , Software
12.
PLoS Comput Biol ; 15(4): e1006650, 2019 04.
Article in English | MEDLINE | ID: mdl-30958812

ABSTRACT

Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.


Subject(s)
Bayes Theorem , Biological Evolution , Phylogeny , Software , Animals , Computational Biology , Computer Simulation , Evolution, Molecular , Humans , Markov Chains , Models, Genetic , Monte Carlo Method
13.
Mol Biol Evol ; 35(2): 504-517, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29220490

ABSTRACT

Reticulate species evolution, such as hybridization or introgression, is relatively common in nature. In the presence of reticulation, species relationships can be captured by a rooted phylogenetic network, and orthologous gene evolution can be modeled as bifurcating gene trees embedded in the species network. We present a Bayesian approach to jointly infer species networks and gene trees from multilocus sequence data. A novel birth-hybridization process is used as the prior for the species network, and we assume a multispecies network coalescent prior for the embedded gene trees. We verify the ability of our method to correctly sample from the posterior distribution, and thus to infer a species network, through simulations. To quantify the power of our method, we reanalyze two large data sets of genes from spruces and yeasts. For the three closely related spruces, we verify the previously suggested homoploid hybridization event in this clade; for the yeast data, we find extensive hybridization events. Our method is available within the BEAST 2 add-on SpeciesNetwork, and thus provides an extensible framework for Bayesian inference of reticulate evolution.


Subject(s)
Genetic Techniques , Hybridization, Genetic , Models, Genetic , Phylogeny , Bayes Theorem , Computer Simulation , Molecular Sequence Data , Picea/genetics , Saccharomyces/genetics
14.
Planta ; 250(6): 1941-1953, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31529398

ABSTRACT

MAIN CONCLUSION: Unlike rosette leaves, the mature Arabidopsis rosette core can display full resistance to Botrytis cinerea revealing the importance for spatial and developmental aspects of plant fungal resistance. Arabidopsis thaliana is a model host to investigate plant defense against fungi. However, many of the reports investigating Arabidopsis fungal defense against the necrotrophic fungus, Botrytis cinerea, utilize rosette leaves as host tissue. Here we report organ-dependent differences in B. cinerea resistance of Arabidopsis. Although wild-type Arabidopsis rosette leaves mount a jasmonate-dependent defense that slows fungal growth, this defense is incapable of resisting fungal devastation. In contrast, as the fungus spreads through infected leaf petioles towards the plant center, or rosette core, there is a jasmonate- and age-dependent fungal penetration blockage into the rosette core. We report evidence for induced and preformed resistance in the rosette core, as direct rosette core inoculation can also result in resistance, but at a lower penetrance relative to infections that approach the core from infected leaf petioles. The Arabidopsis rosette core displays a distinct transcriptome relative to other plant organs, and BLADE ON PETIOLE (BOP) transcripts are abundant in the rosette core. The BOP genes, with known roles in abscission zone formation, are required for full Arabidopsis rosette core B. cinerea resistance, suggesting a possible role for BOP-dependent modifications that may help to restrict fungal susceptibility of the rosette core. Finally, we demonstrate that cabbage and cauliflower, common Brassicaceae crops, also display leaf susceptibility and rosette core resistance to B. cinerea that can involve leaf abscission. Thus, spatial and developmental aspects of plant host resistance play critical roles in resistance to necrotrophic fungal pathogens and are important to our understanding of plant defense mechanisms.


Subject(s)
Arabidopsis/immunology , Disease Resistance , Plant Diseases/microbiology , Plant Leaves/microbiology , Arabidopsis/microbiology , Arabidopsis/physiology , Botrytis , Gene Expression Profiling , Plant Diseases/immunology , Plant Growth Regulators/metabolism , Plant Leaves/immunology , Real-Time Polymerase Chain Reaction
15.
J Exp Bot ; 70(15): 3955-3967, 2019 08 07.
Article in English | MEDLINE | ID: mdl-31056646

ABSTRACT

Lateral root (LR) proliferation is a major determinant of soil nutrient uptake. How resource allocation controls the extent of LR growth remains unresolved. We used genetic, physiological, transcriptomic, and grafting approaches to define a role for C-TERMINALLY ENCODED PEPTIDE RECEPTOR 1 (CEPR1) in controlling sucrose-dependent LR growth. CEPR1 inhibited LR growth in response to applied sucrose, other metabolizable sugars, and elevated light intensity. Pathways through CEPR1 restricted LR growth by reducing LR meristem size and the length of mature LR cells. RNA-sequencing of wild-type (WT) and cepr1-1 roots with or without sucrose treatment revealed an intersection of CEP-CEPR1 signalling with the sucrose transcriptional response. Sucrose up-regulated several CEP genes, supporting a specific role for CEP-CEPR1 in the response to sucrose. Moreover, genes with basally perturbed expression in cepr1-1 overlap with WT sucrose-responsive genes significantly. We found that exogenous CEP inhibited LR growth via CEPR1 by reducing LR meristem size and mature cell length. This result is consistent with CEP-CEPR1 acting to curtail the extent of sucrose-dependent LR growth. Reciprocal grafting indicates that LR growth inhibition requires CEPR1 in both the roots and shoots. Our results reveal a new role for CEP-CEPR1 signalling in controlling LR growth in response to sucrose.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/growth & development , Arabidopsis/metabolism , Plant Roots/growth & development , Plant Roots/metabolism , Receptors, Peptide/metabolism , Sucrose/metabolism , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Gene Expression Regulation, Plant/genetics , Gene Expression Regulation, Plant/physiology , Meristem/genetics , Meristem/growth & development , Meristem/metabolism , Plant Roots/genetics , Receptors, Peptide/genetics
16.
Mol Biol Evol ; 34(8): 2101-2114, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28431121

ABSTRACT

Fully Bayesian multispecies coalescent (MSC) methods like *BEAST estimate species trees from multiple sequence alignments. Today thousands of genes can be sequenced for a given study, but using that many genes with *BEAST is intractably slow. An alternative is to use heuristic methods which compromise accuracy or completeness in return for speed. A common heuristic is concatenation, which assumes that the evolutionary history of each gene tree is identical to the species tree. This is an inconsistent estimator of species tree topology, a worse estimator of divergence times, and induces spurious substitution rate variation when incomplete lineage sorting is present. Another class of heuristics directly motivated by the MSC avoids many of the pitfalls of concatenation but cannot be used to estimate divergence times. To enable fuller use of available data and more accurate inference of species tree topologies, divergence times, and substitution rates, we have developed a new version of *BEAST called StarBEAST2. To improve convergence rates we add analytical integration of population sizes, novel MCMC operators and other optimizations. Computational performance improved by 13.5× and 13.8× respectively when analyzing two empirical data sets, and an average of 33.1× across 30 simulated data sets. To enable accurate estimates of per-species substitution rates, we introduce species tree relaxed clocks, and show that StarBEAST2 is a more powerful and robust estimator of rate variation than concatenation. StarBEAST2 is available through the BEAUTi package manager in BEAST 2.4 and above.


Subject(s)
Sequence Alignment/methods , Base Sequence , Bayes Theorem , Biological Evolution , Computer Simulation , Genetic Speciation , Models, Genetic , Mutation Rate , Phylogeny , Software
17.
Syst Biol ; 65(3): 381-96, 2016 May.
Article in English | MEDLINE | ID: mdl-26821913

ABSTRACT

Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with species divergence. The multispecies coalescent is supported by previous studies which found that its predicted distributions fit empirical data, and that concatenation is not a consistent estimator of the species tree. *BEAST, a fully Bayesian implementation of the multispecies coalescent, is popular but computationally intensive, so the increasing size of phylogenetic data sets is both a computational challenge and an opportunity for better systematics. Using simulation studies, we characterize the scaling behavior of *BEAST, and enable quantitative prediction of the impact increasing the number of loci has on both computational performance and statistical accuracy. Follow-up simulations over a wide range of parameters show that the statistical performance of *BEAST relative to concatenation improves both as branch length is reduced and as the number of loci is increased. Finally, using simulations based on estimated parameters from two phylogenomic data sets, we compare the performance of a range of species tree and concatenation methods to show that using *BEAST with tens of loci can be preferable to using concatenation with thousands of loci. Our results provide insight into the practicalities of Bayesian species tree estimation, the number of loci required to obtain a given level of accuracy and the situations in which supermatrix or summary methods will be outperformed by the fully Bayesian multispecies coalescent.


Subject(s)
Classification/methods , Phylogeny , Software , Bayes Theorem , Biological Evolution , Data Interpretation, Statistical , Evolution, Molecular , Models, Genetic
18.
BMC Genomics ; 15: 870, 2014 Oct 06.
Article in English | MEDLINE | ID: mdl-25287121

ABSTRACT

BACKGROUND: Small, secreted signaling peptides work in parallel with phytohormones to control important aspects of plant growth and development. Genes from the C-TERMINALLY ENCODED PEPTIDE (CEP) family produce such peptides which negatively regulate plant growth, especially under stress, and affect other important developmental processes. To illuminate how the CEP gene family has evolved within the plant kingdom, including its emergence, diversification and variation between lineages, a comprehensive survey was undertaken to identify and characterize CEP genes in 106 plant genomes. RESULTS: Using a motif-based system developed for this study to identify canonical CEP peptide domains, a total of 916 CEP genes and 1,223 CEP domains were found in angiosperms and for the first time in gymnosperms. This defines a narrow band for the emergence of CEP genes in plants, from the divergence of lycophytes to the angiosperm/gymnosperm split. Both CEP genes and domains were found to have diversified in angiosperms, particularly in the Poaceae and Solanaceae plant families. Multispecies orthologous relationships were determined for 22% of identified CEP genes, and further analysis of those groups found selective constraints upon residues within the CEP peptide and within the previously little-characterized variable region. An examination of public Oryza sativa RNA-Seq datasets revealed an expression pattern that links OsCEP5 and OsCEP6 to panicle development and flowering, and CEP gene trees reveal these emerged from a duplication event associated with the Poaceae plant family. CONCLUSIONS: The characterization of the plant-family specific CEP genes OsCEP5 and OsCEP6, the association of CEP genes with angiosperm-specific development processes like panicle development, and the diversification of CEP genes in angiosperms provides further support for the hypothesis that CEP genes have been integral to the evolution of novel traits within the angiosperm lineage. 
Beyond these findings, the comprehensive set of CEP genes and their properties reported here will be a resource for future research on CEP genes and peptides.


Subject(s)
Evolution, Molecular , Genes, Plant , Magnoliopsida/genetics , Plant Proteins/genetics , Amino Acid Sequence , Base Composition , Genetic Variation , Molecular Sequence Data , Oryza/genetics , Phylogeny , Plant Proteins/chemistry , Plant Proteins/classification
19.
Cancer Discov ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38943574

ABSTRACT

Tumors frequently display high chromosomal instability and contain multiple copies of genomic regions. Here, we describe GRITIC, a generic method for timing genomic gains leading to complex copy number states, using single-sample bulk whole-genome sequencing data. By applying GRITIC to 6,091 tumors, we found that non-parsimonious evolution is frequent in the formation of complex copy number states in genome-doubled tumors. We measured chromosomal instability before and after genome duplication in human tumors and found that late genome doubling was followed by an increase in the rate of copy number gain. Copy number gains often accumulate as punctuated bursts, commonly after genome doubling. We infer that genome duplications typically affect the landscape of copy number losses, while only minimally impacting copy number gains. In summary, GRITIC is a novel copy number gain timing framework that permits the analysis of copy number evolution in chromosomally unstable tumors.

20.
Planta ; 238(1): 91-105, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23572382

ABSTRACT

Plant root architecture is regulated by the initiation and modulation of cell division in regions containing pluripotent stem cells known as meristems. In roots, meristems are formed early in embryogenesis, in the case of the root apical meristem (RAM), and during organogenesis at the site of lateral root or, in legumes, nodule formation. Root meristems can also be generated in vitro from leaf explant cultures supplemented with auxin. MicroRNAs (miRNAs) have emerged as regulators of many key biological functions in plants including root development. To identify key miRNAs involved in root meristem formation in Medicago truncatula, we used deep sequencing to compare miRNA populations. Comparisons were made between: (1) the root tip (RT), containing the RAM and the elongation zone (EZ) tissue and (2) root forming callus (RFC) and non-root forming callus (NRFC). We identified 83 previously reported miRNAs, 24 new to M. truncatula, in 44 families. For the first time in M. truncatula, members of conserved miRNA families miR165, miR181 and miR397 were found. Bioinformatic analysis identified 38 potential novel miRNAs. Selected miRNAs and targets were validated using Taqman miRNA assays and 5' RACE. Many miRNAs were differentially expressed between tissues, particularly RFC and NRFC. Target prediction revealed a number of miRNAs to target genes previously shown to be differentially expressed between RT and EZ or RFC and NRFC and important in root development. Additionally, we predict the miRNA/target relationships for miR397 and miR160 to be conserved in M. truncatula. Among the predictions were AUXIN RESPONSE FACTOR 10, targeted by miR160, and a LACCASE-like gene, targeted by miR397; both are miRNA/target pairings conserved in other species.


Subject(s)
Gene Expression Profiling/methods , Medicago truncatula/genetics , MicroRNAs/genetics , Plant Roots/growth & development , Plant Roots/genetics , Base Sequence , Conserved Sequence , Gene Expression Regulation, Plant , High-Throughput Nucleotide Sequencing , Medicago truncatula/growth & development , Meristem/genetics , Reproducibility of Results , Tissue Culture Techniques , Transcriptome