Results 1 - 20 of 25
1.
Nature ; 629(8013): 851-860, 2024 May.
Article in English | MEDLINE | ID: mdl-38560995

ABSTRACT

Despite tremendous efforts in the past decades, relationships among main avian lineages remain heavily debated without a clear resolution. Discrepancies have been attributed to diversity of species sampled, phylogenetic method and the choice of genomic regions [1-3]. Here we address these issues by analysing the genomes of 363 bird species [4] (218 taxonomic families, 92% of total). Using intergenic regions and coalescent methods, we present a well-supported tree but also a marked degree of discordance. The tree confirms that Neoaves experienced rapid radiation at or near the Cretaceous-Palaeogene boundary. Sufficient loci rather than extensive taxon sampling were more effective in resolving difficult nodes. Remaining recalcitrant nodes involve species that are a challenge to model due to either extreme DNA composition, variable substitution rates, incomplete lineage sorting or complex evolutionary events such as ancient hybridization. Assessment of the effects of different genomic partitions showed high heterogeneity across the genome. We discovered sharp increases in effective population size, substitution rates and relative brain size following the Cretaceous-Palaeogene extinction event, supporting the hypothesis that emerging ecological opportunities catalysed the diversification of modern birds. The resulting phylogenetic estimate offers fresh insights into the rapid radiation of modern birds and provides a taxon-rich backbone tree for future comparative studies.


Subject(s)
Birds , Evolution, Molecular , Genome , Phylogeny , Animals , Birds/genetics , Birds/classification , Birds/anatomy & histology , Brain/anatomy & histology , Extinction, Biological , Genome/genetics , Genomics , Population Density , Male , Female
2.
Genome Res ; 31(11): 2107-2119, 2021 11.
Article in English | MEDLINE | ID: mdl-34426513

ABSTRACT

Coalescent methods are proven and powerful tools for population genetics, phylogenetics, epidemiology, and other fields. A promising avenue for the analysis of large genomic alignments, which are increasingly common, is coalescent hidden Markov model (coalHMM) methods, but these methods have lacked general usability and flexibility. We introduce a novel method for automatically learning a coalHMM and inferring the posterior distributions of evolutionary parameters using black-box variational inference, with the transition rates between local genealogies derived empirically by simulation. This derivation enables our method to work directly with three or four taxa and through a divide-and-conquer approach with more taxa. Using a simulated data set resembling a human-chimp-gorilla scenario, we show that our method has comparable or better accuracy to previous coalHMM methods. Both species divergence times and population sizes were accurately inferred. The method also infers local genealogies, and we report on their accuracy. Furthermore, we discuss a potential direction for scaling the method to larger data sets through a divide-and-conquer approach. This accuracy means our method is useful now, and by deriving transition rates by simulation, it is flexible enough to enable future implementations of various population models.


Subject(s)
Genetics, Population , Models, Genetic , Animals , Computer Simulation , Humans , Population Density , Recombination, Genetic
3.
Genome Res ; 31(4): 635-644, 2021 04.
Article in English | MEDLINE | ID: mdl-33602693

ABSTRACT

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.


Subject(s)
COVID-19/diagnosis , COVID-19/transmission , Genetic Variation , Genome, Viral , Real-Time Polymerase Chain Reaction/methods , SARS-CoV-2/genetics , COVID-19/virology , Host-Pathogen Interactions , Humans , Polymorphism, Single Nucleotide
4.
PLoS Genet ; 17(8): e1009701, 2021 08.
Article in English | MEDLINE | ID: mdl-34407067

ABSTRACT

Trait evolution among a set of species, a central theme in evolutionary biology, has long been understood and analyzed with respect to a species tree. However, the field of phylogenomics, which has been propelled by advances in sequencing technologies, has ushered in the era of species/gene tree incongruence and, consequently, a more nuanced understanding of trait evolution. For a trait whose states are incongruent with the branching patterns in the species tree, the same state could have arisen independently in different species (homoplasy) or followed the branching patterns of gene trees, incongruent with the species tree (hemiplasy). Another evolutionary process whose extent and significance are better revealed by phylogenomic studies is gene flow between different species. In this work, we present a phylogenomic method for assessing the role of hybridization and introgression in the evolution of polymorphic or monomorphic binary traits. We apply the method to simulated evolutionary scenarios to demonstrate the interplay between the parameters of the evolutionary history and the role of introgression in a binary trait's evolution (which we call xenoplasy). Importantly, we demonstrate, including on a biological data set, that inferring a species tree and using it for trait evolution analysis in the presence of gene flow could lead to misleading hypotheses about trait evolution.


Subject(s)
Computational Biology/methods , Genetic Introgression/genetics , Quantitative Trait Loci , Evolution, Molecular , Genetic Speciation , Models, Genetic , Phenotype , Phylogeny
5.
Bioinformatics ; 38(Suppl 1): i195-i202, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758771

ABSTRACT

MOTIVATION: Single-nucleotide variants (SNVs) are the most common variations in the human genome. Recently developed methods for SNV detection from single-cell DNA sequencing data, such as SCIΦ and scVILP, leverage the evolutionary history of the cells to overcome the technical errors associated with single-cell sequencing protocols. Despite being accurate, these methods are not scalable to the extensive genomic breadth of single-cell whole-genome (scWGS) and whole-exome sequencing (scWES) data. RESULTS: Here, we report on a new scalable method, Phylovar, which extends the phylogeny-guided variant calling approach to sequencing datasets containing millions of loci. Through benchmarking on simulated datasets under different settings, we show that Phylovar outperforms SCIΦ in terms of running time while being more accurate than Monovar (which is not phylogeny-aware) in terms of SNV detection. Furthermore, we applied Phylovar to two real biological datasets: an scWES triple-negative breast cancer dataset consisting of 32 cells and 3375 loci, and an scWGS dataset of neuron cells from a normal human brain containing 16 cells and approximately 2.5 million loci. For the cancer data, Phylovar detected somatic SNVs with high or moderate functional impact that were also supported by a bulk sequencing dataset, and for the neuron dataset, Phylovar identified 5745 SNVs with non-synonymous effects, some of which were associated with neurodegenerative diseases. AVAILABILITY AND IMPLEMENTATION: Phylovar is implemented in Python and is publicly available at https://github.com/NakhlehLab/Phylovar.


Subject(s)
High-Throughput Nucleotide Sequencing , Nucleotides , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Humans , Phylogeny , Sequence Analysis, DNA
6.
Mol Phylogenet Evol ; 181: 107724, 2023 04.
Article in English | MEDLINE | ID: mdl-36720421

ABSTRACT

Accurate inference of population parameters plays a pivotal role in unravelling evolutionary histories. While recombination has been universally accepted as a fundamental process in the evolution of sexually reproducing organisms, it remains challenging to model it exactly. Thus, existing coalescent-based approaches make different assumptions or approximations to facilitate phylogenetic inference, which can potentially bring about biases in estimates of evolutionary parameters when recombination is present. In this article, we evaluate the performance of population parameter estimation using three methods (StarBEAST2, SNAPP, and diCal2) that represent three different types of inference. We performed whole-genome simulations in which recombination rates, mutation rates, and levels of incomplete lineage sorting were varied. We show that StarBEAST2 using short or medium-sized loci is robust to realistic rates of recombination, which is in agreement with previous studies. SNAPP, as expected, is generally unaffected by recombination events. Most surprisingly, diCal2, a method that is designed to explicitly account for recombination, performs considerably worse than other methods under comparison.


Subject(s)
Genome , Mutation Rate , Phylogeny , Recombination, Genetic , Models, Genetic , Computer Simulation
7.
Syst Biol ; 71(3): 706-720, 2022 04 19.
Article in English | MEDLINE | ID: mdl-34605924

ABSTRACT

Phylogenetic networks provide a powerful framework for modeling and analyzing reticulate evolutionary histories. While polyploidy has been shown to be prevalent not only in plants but also in other groups of eukaryotic species, most work done thus far on phylogenetic network inference assumes diploid hybridization. These inference methods have been applied, with varying degrees of success, to data sets with polyploid species, even though polyploidy violates the mathematical assumptions underlying these methods. Statistical methods were developed recently for handling specific types of polyploids and so were parsimony methods that could handle polyploidy more generally yet while excluding processes such as incomplete lineage sorting. In this article, we introduce a new method for inferring most parsimonious phylogenetic networks on data that include polyploid species. Taking gene tree topologies as input, the method seeks a phylogenetic network that minimizes deep coalescences while accounting for polyploidy. We demonstrate the performance of the method on both simulated and biological data. The inference method as well as a method for evaluating evolutionary hypotheses in the form of phylogenetic networks are implemented and publicly available in the PhyloNet software package. [Incomplete lineage sorting; minimizing deep coalescences; multilabeled trees; multispecies network coalescent; phylogenetic networks; polyploidy.].


Subject(s)
Hybridization, Genetic , Polyploidy , Biological Evolution , Humans , Phylogeny
8.
PLoS Comput Biol ; 18(6): e1010216, 2022 06.
Article in English | MEDLINE | ID: mdl-35675326

ABSTRACT

Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences (MHGs) where each MHG is a maximal set of maximum-length sequences which are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and their overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers causes incorrect inferences as to the overall relationship between bacterial taxa.


Subject(s)
Genome, Bacterial , Prokaryotic Cells , Gene Transfer, Horizontal , Genome, Bacterial/genetics , Phylogeny , Sequence Alignment
9.
Syst Biol ; 71(1): 208-220, 2021 12 16.
Article in English | MEDLINE | ID: mdl-34228807

ABSTRACT

Evolutionary models account for either population- or species-level processes but usually not both. We introduce a new model, the FBD-MSC, which makes it possible for the first time to integrate both the genealogical and fossilization phenomena, by means of the multispecies coalescent (MSC) and the fossilized birth-death (FBD) processes. Using this model, we reconstruct the phylogeny representing all extant and many fossil Caninae, recovering both the relative and absolute time of speciation events. We quantify known inaccuracy issues with divergence time estimates using the popular strategy of concatenating molecular alignments and show that the FBD-MSC solves them. Our new integrative method and empirical results advance the paradigm and practice of probabilistic total evidence analyses in evolutionary biology. [Caninae; fossilized birth-death; molecular clock; multispecies coalescent; phylogenetics; species trees.]


Subject(s)
Genetic Speciation , Models, Biological , Biological Evolution , Fossils , Phylogeny
10.
Mol Biol Evol ; 37(6): 1809-1818, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32077947

ABSTRACT

Species tree inference from multilocus data has emerged as a powerful paradigm in the postgenomic era, both in terms of the accuracy of the species tree it produces as well as in terms of elucidating the processes that shaped the evolutionary history. Bayesian methods for species tree inference are desirable in this area as they have been shown not only to yield accurate estimates, but also to naturally provide measures of confidence in those estimates. However, the heavy computational requirements of Bayesian inference have limited the applicability of such methods to very small data sets. In this article, we show that the computational efficiency of Bayesian inference under the multispecies coalescent can be improved in practice by restricting the space of the gene trees explored during the random walk, without sacrificing accuracy as measured by various metrics. The idea is to first infer constraints on the trees of the individual loci in the form of unresolved gene trees, and then to restrict the sampler to consider only resolutions of the constrained trees. We demonstrate the improvements gained by such an approach on both simulated and biological data.


Subject(s)
Models, Genetic , Phylogeny , Bayes Theorem , Markov Chains , Monte Carlo Method
11.
Bioinformatics ; 35(14): i370-i378, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510688

ABSTRACT

MOTIVATION: Reticulate evolutionary histories, such as those arising in the presence of hybridization, are best modeled as phylogenetic networks. Recently developed methods allow for statistical inference of phylogenetic networks while also accounting for other processes, such as incomplete lineage sorting. However, these methods can only handle a small number of loci from a handful of genomes. RESULTS: In this article, we introduce a novel two-step method for scalable inference of phylogenetic networks from the sequence alignments of multiple, unlinked loci. The method infers networks on subproblems and then merges them into a network on the full set of taxa. To reduce the number of trinets to infer, we formulate a Hitting Set version of the problem of finding a small number of subsets, and implement a simple heuristic to solve it. We studied their performance, in terms of both running time and accuracy, on simulated as well as on biological datasets. The two-step method accurately infers phylogenetic networks at a scale that is infeasible with existing methods. The results are a significant and promising step towards accurate, large-scale phylogenetic network inference. AVAILABILITY AND IMPLEMENTATION: We implemented the algorithms in the publicly available software package PhyloNet (https://bioinfocs.rice.edu/PhyloNet). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Phylogeny , Evolution, Molecular , Genome , Sequence Alignment , Software
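As an illustration of the Hitting Set formulation mentioned in the abstract of entry 11, the following is a minimal sketch of the textbook greedy heuristic: repeatedly pick the element that occurs in the most sets not yet hit. It is not the PhyloNet implementation, and the abstract does not say which heuristic the authors use; the function name `greedy_hitting_set` and the toy taxon labels are hypothetical.

```python
from collections import Counter


def greedy_hitting_set(sets):
    """Return a small set of elements that intersects every input set.

    Textbook greedy heuristic (an assumption for illustration; not taken
    from the paper). The result is not guaranteed to be minimum.
    """
    remaining = [frozenset(s) for s in sets]
    if any(not s for s in remaining):
        raise ValueError("an empty set cannot be hit")
    chosen = set()
    while remaining:
        # Count how many still-unhit sets each element appears in.
        counts = Counter(e for s in remaining for e in s)
        # Pick the most frequent element; break ties by element order.
        best = min(counts, key=lambda e: (-counts[e], e))
        chosen.add(best)
        # Drop every set that the chosen element now hits.
        remaining = [s for s in remaining if best not in s]
    return chosen


# Hypothetical toy input: three overlapping taxon subsets.
example = [{"A", "B"}, {"B", "C"}, {"C", "D"}]
hitting = greedy_hitting_set(example)
```

On the toy input, the two elements "B" and "C" together hit all three subsets, so fewer subproblems would need to be touched than if every subset were handled separately.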
12.
PLoS Comput Biol ; 15(4): e1006650, 2019 04.
Article in English | MEDLINE | ID: mdl-30958812

ABSTRACT

Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.


Subject(s)
Bayes Theorem , Biological Evolution , Phylogeny , Software , Animals , Computational Biology , Computer Simulation , Evolution, Molecular , Humans , Markov Chains , Models, Genetic , Monte Carlo Method
13.
Mol Biol Evol ; 35(2): 504-517, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29220490

ABSTRACT

Reticulate species evolution, such as hybridization or introgression, is relatively common in nature. In the presence of reticulation, species relationships can be captured by a rooted phylogenetic network, and orthologous gene evolution can be modeled as bifurcating gene trees embedded in the species network. We present a Bayesian approach to jointly infer species networks and gene trees from multilocus sequence data. A novel birth-hybridization process is used as the prior for the species network, and we assume a multispecies network coalescent prior for the embedded gene trees. We verify the ability of our method to correctly sample from the posterior distribution, and thus to infer a species network, through simulations. To quantify the power of our method, we reanalyze two large data sets of genes from spruces and yeasts. For the three closely related spruces, we verify the previously suggested homoploid hybridization event in this clade; for the yeast data, we find extensive hybridization events. Our method is available within the BEAST 2 add-on SpeciesNetwork, and thus provides an extensible framework for Bayesian inference of reticulate evolution.


Subject(s)
Genetic Techniques , Hybridization, Genetic , Models, Genetic , Phylogeny , Bayes Theorem , Computer Simulation , Molecular Sequence Data , Picea/genetics , Saccharomyces/genetics
14.
Planta ; 250(6): 1941-1953, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31529398

ABSTRACT

MAIN CONCLUSION: Unlike rosette leaves, the mature Arabidopsis rosette core can display full resistance to Botrytis cinerea revealing the importance for spatial and developmental aspects of plant fungal resistance. Arabidopsis thaliana is a model host to investigate plant defense against fungi. However, many of the reports investigating Arabidopsis fungal defense against the necrotrophic fungus, Botrytis cinerea, utilize rosette leaves as host tissue. Here we report organ-dependent differences in B. cinerea resistance of Arabidopsis. Although wild-type Arabidopsis rosette leaves mount a jasmonate-dependent defense that slows fungal growth, this defense is incapable of resisting fungal devastation. In contrast, as the fungus spreads through infected leaf petioles towards the plant center, or rosette core, there is a jasmonate- and age-dependent fungal penetration blockage into the rosette core. We report evidence for induced and preformed resistance in the rosette core, as direct rosette core inoculation can also result in resistance, but at a lower penetrance relative to infections that approach the core from infected leaf petioles. The Arabidopsis rosette core displays a distinct transcriptome relative to other plant organs, and BLADE ON PETIOLE (BOP) transcripts are abundant in the rosette core. The BOP genes, with known roles in abscission zone formation, are required for full Arabidopsis rosette core B. cinerea resistance, suggesting a possible role for BOP-dependent modifications that may help to restrict fungal susceptibility of the rosette core. Finally, we demonstrate that cabbage and cauliflower, common Brassicaceae crops, also display leaf susceptibility and rosette core resistance to B. cinerea that can involve leaf abscission. Thus, spatial and developmental aspects of plant host resistance play critical roles in resistance to necrotrophic fungal pathogens and are important to our understanding of plant defense mechanisms.


Subject(s)
Arabidopsis/immunology , Disease Resistance , Plant Diseases/microbiology , Plant Leaves/microbiology , Arabidopsis/microbiology , Arabidopsis/physiology , Botrytis , Gene Expression Profiling , Plant Diseases/immunology , Plant Growth Regulators/metabolism , Plant Leaves/immunology , Real-Time Polymerase Chain Reaction
15.
J Exp Bot ; 70(15): 3955-3967, 2019 08 07.
Article in English | MEDLINE | ID: mdl-31056646

ABSTRACT

Lateral root (LR) proliferation is a major determinant of soil nutrient uptake. How resource allocation controls the extent of LR growth remains unresolved. We used genetic, physiological, transcriptomic, and grafting approaches to define a role for C-TERMINALLY ENCODED PEPTIDE RECEPTOR 1 (CEPR1) in controlling sucrose-dependent LR growth. CEPR1 inhibited LR growth in response to applied sucrose, other metabolizable sugars, and elevated light intensity. Pathways through CEPR1 restricted LR growth by reducing LR meristem size and the length of mature LR cells. RNA-sequencing of wild-type (WT) and cepr1-1 roots with or without sucrose treatment revealed an intersection of CEP-CEPR1 signalling with the sucrose transcriptional response. Sucrose up-regulated several CEP genes, supporting a specific role for CEP-CEPR1 in the response to sucrose. Moreover, genes with basally perturbed expression in cepr1-1 overlap with WT sucrose-responsive genes significantly. We found that exogenous CEP inhibited LR growth via CEPR1 by reducing LR meristem size and mature cell length. This result is consistent with CEP-CEPR1 acting to curtail the extent of sucrose-dependent LR growth. Reciprocal grafting indicates that LR growth inhibition requires CEPR1 in both the roots and shoots. Our results reveal a new role for CEP-CEPR1 signalling in controlling LR growth in response to sucrose.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/growth & development , Arabidopsis/metabolism , Plant Roots/growth & development , Plant Roots/metabolism , Receptors, Peptide/metabolism , Sucrose/metabolism , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Gene Expression Regulation, Plant/genetics , Gene Expression Regulation, Plant/physiology , Meristem/genetics , Meristem/growth & development , Meristem/metabolism , Plant Roots/genetics , Receptors, Peptide/genetics
16.
Mol Biol Evol ; 34(8): 2101-2114, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28431121

ABSTRACT

Fully Bayesian multispecies coalescent (MSC) methods like *BEAST estimate species trees from multiple sequence alignments. Today thousands of genes can be sequenced for a given study, but using that many genes with *BEAST is intractably slow. An alternative is to use heuristic methods which compromise accuracy or completeness in return for speed. A common heuristic is concatenation, which assumes that the evolutionary history of each gene tree is identical to the species tree. This is an inconsistent estimator of species tree topology, a worse estimator of divergence times, and induces spurious substitution rate variation when incomplete lineage sorting is present. Another class of heuristics directly motivated by the MSC avoids many of the pitfalls of concatenation but cannot be used to estimate divergence times. To enable fuller use of available data and more accurate inference of species tree topologies, divergence times, and substitution rates, we have developed a new version of *BEAST called StarBEAST2. To improve convergence rates we add analytical integration of population sizes, novel MCMC operators and other optimizations. Computational performance improved by 13.5× and 13.8× respectively when analyzing two empirical data sets, and an average of 33.1× across 30 simulated data sets. To enable accurate estimates of per-species substitution rates, we introduce species tree relaxed clocks, and show that StarBEAST2 is a more powerful and robust estimator of rate variation than concatenation. StarBEAST2 is available through the BEAUTi package manager in BEAST 2.4 and above.


Subject(s)
Sequence Alignment/methods , Base Sequence , Bayes Theorem , Biological Evolution , Computer Simulation , Genetic Speciation , Models, Genetic , Mutation Rate , Phylogeny , Software
17.
Syst Biol ; 65(3): 381-96, 2016 May.
Article in English | MEDLINE | ID: mdl-26821913

ABSTRACT

Under the multispecies coalescent model of molecular evolution, gene trees have independent evolutionary histories within a shared species tree. In comparison, supermatrix concatenation methods assume that gene trees share a single common genealogical history, thereby equating gene coalescence with species divergence. The multispecies coalescent is supported by previous studies which found that its predicted distributions fit empirical data, and that concatenation is not a consistent estimator of the species tree. *BEAST, a fully Bayesian implementation of the multispecies coalescent, is popular but computationally intensive, so the increasing size of phylogenetic data sets is both a computational challenge and an opportunity for better systematics. Using simulation studies, we characterize the scaling behavior of *BEAST, and enable quantitative prediction of the impact increasing the number of loci has on both computational performance and statistical accuracy. Follow-up simulations over a wide range of parameters show that the statistical performance of *BEAST relative to concatenation improves both as branch length is reduced and as the number of loci is increased. Finally, using simulations based on estimated parameters from two phylogenomic data sets, we compare the performance of a range of species tree and concatenation methods to show that using *BEAST with tens of loci can be preferable to using concatenation with thousands of loci. Our results provide insight into the practicalities of Bayesian species tree estimation, the number of loci required to obtain a given level of accuracy and the situations in which supermatrix or summary methods will be outperformed by the fully Bayesian multispecies coalescent.


Subject(s)
Classification/methods , Phylogeny , Software , Bayes Theorem , Biological Evolution , Data Interpretation, Statistical , Evolution, Molecular , Models, Genetic
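To make the multispecies coalescent statement in the abstract of entry 17 concrete, the sketch below evaluates the standard result for a rooted three-taxon species tree ((A,B),C). With an internal branch of length T in coalescent units, the gene tree matches the species tree with probability 1 - (2/3)e^(-T), and each of the two discordant topologies has probability (1/3)e^(-T). The function name is hypothetical and the example is not taken from the paper.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    t is the internal branch length in coalescent units (t >= 0).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability of each discordant topology: lineages A and B fail to
    # coalesce on the internal branch, then one of three pairs joins first.
    discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


short_branch = three_taxon_gene_tree_probs(0.1)
long_branch = three_taxon_gene_tree_probs(2.0)
```

As the internal branch shrinks toward zero, all three topologies approach probability 1/3, which is the short-branch regime in which the abstract reports the largest gain of *BEAST over concatenation.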
18.
BMC Genomics ; 15: 870, 2014 Oct 06.
Article in English | MEDLINE | ID: mdl-25287121

ABSTRACT

BACKGROUND: Small, secreted signaling peptides work in parallel with phytohormones to control important aspects of plant growth and development. Genes from the C-TERMINALLY ENCODED PEPTIDE (CEP) family produce such peptides which negatively regulate plant growth, especially under stress, and affect other important developmental processes. To illuminate how the CEP gene family has evolved within the plant kingdom, including its emergence, diversification and variation between lineages, a comprehensive survey was undertaken to identify and characterize CEP genes in 106 plant genomes. RESULTS: Using a motif-based system developed for this study to identify canonical CEP peptide domains, a total of 916 CEP genes and 1,223 CEP domains were found in angiosperms and for the first time in gymnosperms. This defines a narrow band for the emergence of CEP genes in plants, from the divergence of lycophytes to the angiosperm/gymnosperm split. Both CEP genes and domains were found to have diversified in angiosperms, particularly in the Poaceae and Solanaceae plant families. Multispecies orthologous relationships were determined for 22% of identified CEP genes, and further analysis of those groups found selective constraints upon residues within the CEP peptide and within the previously little-characterized variable region. An examination of public Oryza sativa RNA-Seq datasets revealed an expression pattern that links OsCEP5 and OsCEP6 to panicle development and flowering, and CEP gene trees reveal these emerged from a duplication event associated with the Poaceae plant family. CONCLUSIONS: The characterization of the plant-family specific CEP genes OsCEP5 and OsCEP6, the association of CEP genes with angiosperm-specific development processes like panicle development, and the diversification of CEP genes in angiosperms provides further support for the hypothesis that CEP genes have been integral to the evolution of novel traits within the angiosperm lineage. 
Beyond these findings, the comprehensive set of CEP genes and their properties reported here will be a resource for future research on CEP genes and peptides.


Subject(s)
Evolution, Molecular , Genes, Plant , Magnoliopsida/genetics , Plant Proteins/genetics , Amino Acid Sequence , Base Composition , Genetic Variation , Molecular Sequence Data , Oryza/genetics , Phylogeny , Plant Proteins/chemistry , Plant Proteins/classification
19.
Cancer Discov ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38943574

ABSTRACT

Tumors frequently display high chromosomal instability and contain multiple copies of genomic regions. Here, we describe GRITIC, a generic method for timing genomic gains leading to complex copy number states, using single-sample bulk whole-genome sequencing data. By applying GRITIC to 6,091 tumors, we found that non-parsimonious evolution is frequent in the formation of complex copy number states in genome-doubled tumors. We measured chromosomal instability before and after genome duplication in human tumors and found that late genome doubling was followed by an increase in the rate of copy number gain. Copy number gains often accumulate as punctuated bursts, commonly after genome doubling. We infer that genome duplications typically affect the landscape of copy number losses, while only minimally impacting copy number gains. In summary, GRITIC is a novel copy number gain timing framework that permits the analysis of copy number evolution in chromosomally unstable tumors.

20.
Planta ; 238(1): 91-105, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23572382

ABSTRACT

Plant root architecture is regulated by the initiation and modulation of cell division in regions containing pluripotent stem cells known as meristems. In roots, meristems are formed early in embryogenesis, in the case of the root apical meristem (RAM), and during organogenesis at the site of lateral root or, in legumes, nodule formation. Root meristems can also be generated in vitro from leaf explant cultures supplemented with auxin. microRNAs (miRNAs) have emerged as regulators of many key biological functions in plants including root development. To identify key miRNAs involved in root meristem formation in Medicago truncatula, we used deep sequencing to compare miRNA populations. Comparisons were made between: (1) the root tip (RT), containing the RAM and the elongation zone (EZ) tissue and (2) root forming callus (RFC) and non-root forming callus (NRFC). We identified 83 previously reported miRNAs, 24 new to M. truncatula, in 44 families. For the first time in M. truncatula, members of conserved miRNA families miR165, miR181 and miR397 were found. Bioinformatic analysis identified 38 potential novel miRNAs. Selected miRNAs and targets were validated using Taqman miRNA assays and 5' RACE. Many miRNAs were differentially expressed between tissues, particularly RFC and NRFC. Target prediction revealed a number of miRNAs to target genes previously shown to be differentially expressed between RT and EZ or RFC and NRFC and important in root development. Additionally, we predict the miRNA/target relationships for miR397 and miR160 to be conserved in M. truncatula. Amongst the predictions were AUXIN RESPONSE FACTOR 10, targeted by miR160, and a LACCASE-like gene, targeted by miR397; both are miRNA/target pairings conserved in other species.


Subject(s)
Gene Expression Profiling/methods , Medicago truncatula/genetics , MicroRNAs/genetics , Plant Roots/growth & development , Plant Roots/genetics , Base Sequence , Conserved Sequence , Gene Expression Regulation, Plant , High-Throughput Nucleotide Sequencing , Medicago truncatula/growth & development , Meristem/genetics , Reproducibility of Results , Tissue Culture Techniques , Transcriptome