Results 1 - 20 of 102
1.
Nature ; 633(8029): 380-388, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39112710

ABSTRACT

Selfish genetic elements contribute to hybrid incompatibility and bias or 'drive' their own transmission [1,2]. Chromosomal drive typically functions in asymmetric female meiosis, whereas gene drive is normally post-meiotic and typically found in males. Here, using single-molecule and single-pollen genome sequencing, we describe Teosinte Pollen Drive, an instance of gene drive in hybrids between maize (Zea mays ssp. mays) and teosinte mexicana (Z. mays ssp. mexicana) that depends on RNA interference (RNAi). 22-nucleotide small RNAs from a non-coding RNA hairpin in mexicana depend on Dicer-like 2 (Dcl2) and target Teosinte Drive Responder 1 (Tdr1), which encodes a lipase required for pollen viability. Dcl2, Tdr1 and the hairpin are in tight pseudolinkage on chromosome 5, but only when transmitted through the male. Introgression of mexicana into early cultivated maize is thought to have been critical to its geographical dispersal throughout the Americas [3], and a tightly linked inversion in mexicana spans a major domestication sweep in modern maize [4]. A survey of maize traditional varieties and sympatric populations of teosinte mexicana reveals correlated patterns of admixture among unlinked genes required for RNAi on at least four chromosomes that are also subject to gene drive in pollen from synthetic hybrids. Teosinte Pollen Drive probably had a major role in maize domestication and diversification, and offers an explanation for the widespread abundance of 'self' small RNAs in the germ lines of plants and animals.


Subjects
Domestication, Pollen, RNA Interference, Zea mays, Zea mays/genetics, Pollen/genetics, Genetic Hybridization, Genetic Introgression, Plant Genome
2.
Genes Dev ; 2022 Aug 18.
Article in English | MEDLINE | ID: mdl-35981753

ABSTRACT

Promoter-proximal RNA Pol II pausing is a critical step in transcriptional control. Pol II pausing has been predominantly studied in tissue culture systems. While Pol II pausing has been shown to be required for mammalian development, the phenotypic and mechanistic details of this requirement are unknown. Here, we found that loss of Pol II pausing stalls pluripotent state transitions within the epiblast of the early mouse embryo. Using Nelfb -/- mice and a NELFB degron mouse pluripotent stem cell model, we show that embryonic stem cells (ESCs) representing the naïve state of pluripotency successfully initiate a transition program but fail to balance levels of induced and repressed genes and enhancers in the absence of NELF. We found an increase in chromatin-associated NELF during transition from the naïve to later pluripotent states. Overall, our work defines the acute and long-term molecular consequences of NELF loss and reveals a role for Pol II pausing in the pluripotency continuum as a modulator of cell state transitions.

3.
Cell ; 145(4): 622-34, 2011 May 13.
Article in English | MEDLINE | ID: mdl-21549415

ABSTRACT

We report the immediate effects of estrogen signaling on the transcriptome of breast cancer cells using global run-on and sequencing (GRO-seq). The data were analyzed using a new bioinformatic approach that allowed us to identify transcripts directly from the GRO-seq data. We found that estrogen signaling directly regulates a strikingly large fraction of the transcriptome in a rapid, robust, and unexpectedly transient manner. In addition to protein-coding genes, estrogen regulates the distribution and activity of all three RNA polymerases and virtually every class of noncoding RNA that has been described to date. We also identified a large number of previously undetected estrogen-regulated intergenic transcripts, many of which are found proximal to estrogen receptor binding sites. Collectively, our results provide the most comprehensive measurement of the primary and immediate estrogen effects to date and a resource for understanding rapid signal-dependent transcription in other systems.


Subjects
Breast Neoplasms/genetics, Computational Biology/methods, Estrogens/metabolism, Gene Expression Profiling, Neoplastic Gene Expression Regulation, Tumor Cell Line, Estrogen Receptor alpha/metabolism, Genetic Techniques, Humans, Untranslated RNA/genetics, DNA Sequence Analysis, Signal Transduction
4.
PLoS Genet ; 19(11): e1011032, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37934781

ABSTRACT

Investigators have recently introduced powerful methods for population genetic inference that rely on supervised machine learning from simulated data. Despite their performance advantages, these methods can fail when the simulated training data does not adequately resemble data from the real world. Here, we show that this "simulation mis-specification" problem can be framed as a "domain adaptation" problem, where a model learned from one data distribution is applied to a dataset drawn from a different distribution. By applying an established domain-adaptation technique based on a gradient reversal layer (GRL), originally introduced for image classification, we show that the effects of simulation mis-specification can be substantially mitigated. We focus our analysis on two state-of-the-art deep-learning population genetic methods: SIA, which infers positive selection from features of the ancestral recombination graph (ARG), and ReLERNN, which infers recombination rates from genotype matrices. In the case of SIA, the domain adaptive framework also compensates for ARG inference error. Using the domain-adaptive SIA (dadaSIA) model, we estimate improved selection coefficients at selected loci in the 1000 Genomes CEU population. We anticipate that domain adaptation will prove to be widely applicable in the growing use of supervised machine learning in population genetics.
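
The gradient reversal layer named in this abstract can be illustrated with a minimal sketch. This is not the dadaSIA implementation; the class name `GradientReversal`, the scaling parameter `lam`, and the plain-list representation of features and gradients are all assumptions made for illustration. The sketch only shows the defining behaviour of a GRL: it passes features through unchanged in the forward pass and multiplies the incoming gradient by a negative factor in the backward pass, so that the feature extractor is pushed to make simulated and real data harder for a domain classifier to tell apart.

```python
class GradientReversal:
    """Hypothetical, framework-free sketch of a gradient reversal layer (GRL)."""

    def __init__(self, lam=1.0):
        # lam (assumed name) scales how strongly the reversed gradient
        # from the domain classifier affects the feature extractor.
        self.lam = lam

    def forward(self, features):
        # Forward pass: identity. The domain classifier sees the
        # features exactly as the feature extractor produced them.
        return list(features)

    def backward(self, grad_output):
        # Backward pass: flip the sign of the gradient and scale it,
        # so upstream layers are updated to *increase* the domain loss.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
passed_through = grl.forward([1.0, -2.0])
reversed_grad = grl.backward([0.2, -0.4])
```

In a real deep-learning framework the same idea would be expressed as a custom differentiable operation placed between the shared feature extractor and the domain classifier; the label predictor branch does not pass through it.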


Subjects
Machine Learning, Computer Neural Networks, Supervised Machine Learning, Computer Simulation, Population Genetics
5.
Nucleic Acids Res ; 51(21): e106, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-37889042

ABSTRACT

In metazoans, both transcription initiation and the escape of RNA polymerase (RNAP) from promoter-proximal pausing are key rate-limiting steps in gene expression. These processes play out at physically proximal sites on the DNA template and appear to influence one another through steric interactions. Here, we examine the dynamics of these processes using a combination of statistical modeling, simulation, and analysis of real nascent RNA sequencing data. We develop a simple probabilistic model that jointly describes the kinetics of transcription initiation, pause-escape, and elongation, and the generation of nascent RNA sequencing read counts under steady-state conditions. We then extend this initial model to allow for variability across cells in promoter-proximal pause site locations and steric hindrance of transcription initiation from paused RNAPs. In an extensive series of simulations, we show that this model enables accurate estimation of initiation and pause-escape rates. Furthermore, we show by simulation and analysis of real data that pause-escape is often strongly rate-limiting and that steric hindrance can dramatically reduce initiation rates. Our modeling framework is applicable to a variety of inference problems, and our software for estimation and simulation is freely available.


Subjects
DNA-Directed RNA Polymerases, Genetic Transcription, Humans, DNA-Directed RNA Polymerases/genetics, DNA-Directed RNA Polymerases/metabolism, Genetic Promoter Regions, RNA, Base Sequence
6.
PLoS Genet ; 18(11): e1010474, 2022 11.
Article in English | MEDLINE | ID: mdl-36318577

ABSTRACT

Insular organisms often evolve predictable phenotypes, like flightlessness, extreme body sizes, or increased melanin deposition. The evolutionary forces and molecular targets mediating these patterns remain mostly unknown. Here we study the Chestnut-bellied Monarch (Monarcha castaneiventris) from the Solomon Islands, a complex of closely related subspecies in the early stages of speciation. On the large island of Makira M. c. megarhynchus has a chestnut belly, whereas on the small satellite islands of Ugi, and Santa Ana and Santa Catalina (SA/SC) M. c. ugiensis is entirely iridescent blue-black (i.e., melanic). Melanism has likely evolved twice, as the Ugi and SA/SC populations were established independently. To investigate the genetic basis of melanism on each island we generated whole genome sequence data from all three populations. Non-synonymous mutations at the MC1R pigmentation gene are associated with melanism on SA/SC, while ASIP, an antagonistic ligand of MC1R, is associated with melanism on Ugi. Both genes show evidence of selective sweeps in traditional summary statistics and statistics derived from the ancestral recombination graph (ARG). Using the ARG in combination with machine learning, we inferred selection strength, timing of onset and allele frequency trajectories. MC1R shows evidence of a recent, strong, soft selective sweep. The region including ASIP shows more complex signatures; however, we find evidence for sweeps in mutations near ASIP, which are comparatively older than those on MC1R and have been under relatively strong selection. Overall, our study shows convergent melanism results from selective sweeps at independent molecular targets, evolving in taxa where coloration likely mediates reproductive isolation with the neighboring chestnut-bellied subspecies.


Subjects
Melanosis, Passeriformes, Animals, Melanocortin Receptor Type 1/genetics, Pigmentation/genetics, Melanosis/genetics, Passeriformes/genetics, Gene Frequency
7.
Genes Dev ; 31(18): 1841-1846, 2017 09 15.
Article in English | MEDLINE | ID: mdl-29051389

ABSTRACT

Relatively little is known about the in vivo functions of newly emerging genes, especially in metazoans. Although prior RNAi studies reported prevalent lethality among young gene knockdowns, our phylogenomic analyses reveal that young Drosophila genes are frequently restricted to the nonessential male reproductive system. We performed large-scale CRISPR/Cas9 mutagenesis of "conserved, essential" and "young, RNAi-lethal" genes and broadly confirmed the lethality of the former but the viability of the latter. Nevertheless, certain young gene mutants exhibit defective spermatogenesis and/or male sterility. Moreover, we detected widespread signatures of positive selection on young male-biased genes. Thus, young genes have a preferential impact on male reproductive system function.


Subjects
Drosophila melanogaster/genetics, Fertility/genetics, Essential Genes/physiology, Insect Genes/physiology, Reproduction/genetics, Animals, CRISPR-Cas Systems/genetics, Molecular Evolution, Frameshift Mutation, Gene Expression, Gene Expression Profiling, Gene Knockdown Techniques, Lethal Genes/physiology, Male Infertility/genetics, Male, Phylogeny, RNA Interference, Spermatogenesis/genetics, Testis/anatomy & histology, Testis/metabolism
8.
J Mammary Gland Biol Neoplasia ; 29(1): 3, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38289401

ABSTRACT

During female adolescence and pregnancy, rising levels of hormones result in a cyclic source of signals that control the development of mammary tissue. While such alterations are well understood from a whole-gland perspective, the alterations that such hormones bring to organoid cultures derived from mammary glands have yet to be fully mapped. This is of special importance given that organoids are considered suitable systems to understand cross species breast development. Here we utilized single-cell transcriptional profiling to delineate responses of murine and human normal breast organoid systems to female hormones across evolutionary distinct species. Collectively, our study represents a molecular atlas of epithelial dynamics in response to estrogen and pregnancy hormones.


Subjects
Breast, Estrogens, Adolescent, Pregnancy, Humans, Animals, Mice, Female, Organoids
9.
Mol Biol Evol ; 39(1)2022 01 07.
Article in English | MEDLINE | ID: mdl-34888675

ABSTRACT

Detecting signals of selection from genomic data is a central problem in population genetics. Coupling the rich information in the ancestral recombination graph (ARG) with a powerful and scalable deep-learning framework, we developed a novel method to detect and quantify positive selection: Selection Inference using the Ancestral recombination graph (SIA). Built on a Long Short-Term Memory (LSTM) architecture, a particular type of Recurrent Neural Network (RNN), SIA can be trained to explicitly infer a full range of selection coefficients, as well as the allele frequency trajectory and time of selection onset. We benchmarked SIA extensively on simulations under a European human demographic model, and found that it performs as well as or better than some of the best available methods, including state-of-the-art machine-learning and ARG-based methods. In addition, we used SIA to estimate selection coefficients at several loci associated with human phenotypes of interest. SIA detected novel signals of selection particular to the European (CEU) population at the MC1R and ABCC11 loci. In addition, it recapitulated signals of selection at the LCT locus and several pigmentation-related genes. Finally, we reanalyzed polymorphism data of a collection of recently radiated southern capuchino seedeater taxa in the genus Sporophila to quantify the strength of selection and improved the power of our previous methods to detect partial soft sweeps. Overall, SIA uses deep learning to leverage the ARG and thereby provides new insight into how selective sweeps shape genomic diversity.


Subjects
Deep Learning, Genetic Selection, Population Genetics, Genetic Models, Genetic Recombination
10.
Trends Genet ; 36(4): 243-258, 2020 04.
Article in English | MEDLINE | ID: mdl-31954511

ABSTRACT

Methods to detect signals of natural selection from genomic data have traditionally emphasized the use of simple summary statistics. Here, we review a new generation of methods that consider combinations of conventional summary statistics and/or richer features derived from inferred gene trees and ancestral recombination graphs (ARGs). We also review recent advances in methods for population genetic simulation and ARG reconstruction. Finally, we describe opportunities for future work on a variety of related topics, including the genetics of speciation, estimation of selection coefficients, and inference of selection on polygenic traits. Together, these emerging methods offer promising new directions in the study of natural selection.


Subjects
Molecular Evolution, Population Genetics/statistics & numerical data, Genetic Recombination/genetics, Genetic Selection/genetics, Algorithms, Computer Simulation, Genetic Models, Multifactorial Inheritance/genetics, Phylogeny
11.
PLoS Genet ; 16(8): e1008895, 2020 08.
Article in English | MEDLINE | ID: mdl-32760067

ABSTRACT

The sequencing of Neanderthal and Denisovan genomes has yielded many new insights about interbreeding events between extinct hominins and the ancestors of modern humans. While much attention has been paid to the relatively recent gene flow from Neanderthals and Denisovans into modern humans, other instances of introgression leave more subtle genomic evidence and have received less attention. Here, we present a major extension of the ARGweaver algorithm, called ARGweaver-D, which can infer local genetic relationships under a user-defined demographic model that includes population splits and migration events. This Bayesian algorithm probabilistically samples ancestral recombination graphs (ARGs) that specify not only tree topologies and branch lengths along the genome, but also indicate migrant lineages. The sampled ARGs can therefore be parsed to produce probabilities of introgression along the genome. We show that this method is well powered to detect the archaic migration into modern humans, even with only a few samples. We then show that the method can also detect introgressed regions stemming from older migration events, or from unsampled populations. We apply it to human, Neanderthal, and Denisovan genomes, looking for signatures of older proposed migration events, including ancient humans into Neanderthal, and unknown archaic hominins into Denisovans. We identify 3% of the Neanderthal genome that is putatively introgressed from ancient humans, and estimate that the gene flow occurred between 200 and 300 kya. We find no convincing evidence that negative selection acted against these regions. Finally, we predict that 1% of the Denisovan genome was introgressed from an unsequenced, but highly diverged, archaic hominin ancestor. About 15% of these "super-archaic" regions, comprising at least about 4 Mb, were in turn introgressed into modern humans and continue to exist in the genomes of people alive today.


Subjects
Gene Flow, Genetic Models, Neanderthals/genetics, Population/genetics, Genetic Recombination, Animals, Molecular Evolution, Human Migration, Humans
12.
Proc Natl Acad Sci U S A ; 117(48): 30554-30565, 2020 12 01.
Article in English | MEDLINE | ID: mdl-33199636

ABSTRACT

Numerous studies of emerging species have identified genomic "islands" of elevated differentiation against a background of relative homogeneity. The causes of these islands remain unclear, however, with some signs pointing toward "speciation genes" that locally restrict gene flow and others suggesting selective sweeps that have occurred within nascent species after speciation. Here, we examine this question through the lens of genome sequence data for five species of southern capuchino seedeaters, finch-like birds from South America that have undergone a species radiation during the last ∼50,000 generations. By applying newly developed statistical methods for ancestral recombination graph inference and machine-learning methods for the prediction of selective sweeps, we show that previously identified islands of differentiation in these birds appear to be generally associated with relatively recent, species-specific selective sweeps, most of which are predicted to be soft sweeps acting on standing genetic variation. Many of these sweeps coincide with genes associated with melanin-based variation in plumage, suggesting a prominent role for sexual selection. At the same time, a few loci also exhibit indications of possible selection against gene flow. These observations shed light on the complex manner in which natural selection shapes genome sequences during speciation.


Subjects
Genomic Islands, Genetic Models, Animals, Biodiversity, Genetic Variation, Machine Learning
13.
Genome Res ; 29(8): 1310-1321, 2019 08.
Article in English | MEDLINE | ID: mdl-31249063

ABSTRACT

A central challenge in human genomics is to understand the cellular, evolutionary, and clinical significance of genetic variants. Here, we introduce a unified population-genetic and machine-learning model, called Linear Allele-Specific Selection InferencE (LASSIE), for estimating the fitness effects of all observed and potential single-nucleotide variants, based on polymorphism data and predictive genomic features. We applied LASSIE to 51 high-coverage genome sequences annotated with 33 genomic features and constructed a map of allele-specific selection coefficients across all protein-coding sequences in the human genome. This map is generally consistent with previous inferences of the bulk distribution of fitness effects but reveals pervasive weak negative selection against synonymous mutations. In addition, the estimated selection coefficients are highly predictive of inherited pathogenic variants and cancer driver mutations, outperforming state-of-the-art variant prioritization methods. By contrasting our estimated model with ultrahigh coverage ExAC exome-sequencing data, we identified 1118 genes under unusually strong negative selection, which tend to be exclusively expressed in the central nervous system or associated with autism spectrum disorder, as well as 773 genes under unusually weak selection, which tend to be associated with metabolism. This combination of classical population genetic theory with modern machine-learning and large-scale genomic data is a powerful paradigm for the study of both human evolution and disease.


Subjects
Autism Spectrum Disorder/genetics, Human Genome, Machine Learning, Genetic Models, Neoplasms/genetics, Proteome/genetics, Alleles, Autism Spectrum Disorder/metabolism, Autism Spectrum Disorder/pathology, Base Sequence, Genetic Fitness, Genetic Variation, Population Genetics, Genomics, Humans, Inheritance Patterns, Neoplasms/metabolism, Neoplasms/pathology, Single Nucleotide Polymorphism, Proteome/metabolism, Genetic Selection
14.
Bioinformatics ; 37(24): 4727-4736, 2021 12 11.
Article in English | MEDLINE | ID: mdl-34382072

ABSTRACT

MOTIVATION: Quantification of isoform abundance has been extensively studied at the mature RNA level using RNA-seq but not at the level of precursor RNAs using nascent RNA sequencing. RESULTS: We address this problem with a new computational method called Deconvolution of Expression for Nascent RNA-sequencing data (DENR), which models nascent RNA-sequencing read-counts as a mixture of user-provided isoforms. The baseline algorithm is enhanced by machine-learning predictions of active transcription start sites and an adjustment for the typical 'shape profile' of read-counts along a transcription unit. We show that DENR outperforms simple read-count-based methods for estimating gene and isoform abundances, and that transcription of multiple pre-RNA isoforms per gene is widespread, with frequent differences between cell types. In addition, we provide evidence that a majority of human isoform diversity derives from primary transcription rather than from post-transcriptional processes. AVAILABILITY AND IMPLEMENTATION: DENR and nascentRNASim are freely available at https://github.com/CshlSiepelLab/DENR (version v1.0.0) and https://github.com/CshlSiepelLab/nascentRNASim (version v0.3.0). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
RNA Isoforms, RNA, Humans, RNA Isoforms/genetics, Software, Protein Isoforms/genetics, RNA Sequence Analysis/methods, Eukaryotic Initiation Factors/genetics
15.
Nature ; 530(7591): 429-33, 2016 Feb 25.
Article in English | MEDLINE | ID: mdl-26886800

ABSTRACT

It has been shown that Neanderthals contributed genetically to modern humans outside Africa 47,000-65,000 years ago. Here we analyse the genomes of a Neanderthal and a Denisovan from the Altai Mountains in Siberia together with the sequences of chromosome 21 of two Neanderthals from Spain and Croatia. We find that a population that diverged early from other modern humans in Africa contributed genetically to the ancestors of Neanderthals from the Altai Mountains roughly 100,000 years ago. By contrast, we do not detect such a genetic contribution in the Denisovan or the two European Neanderthals. We conclude that in addition to later interbreeding events, the ancestors of Neanderthals from the Altai Mountains and early modern humans met and interbred, possibly in the Near East, many thousands of years earlier than previously thought.


Subjects
Gene Flow/genetics, Neanderthals/genetics, Altitude, Animals, Bayes Theorem, Human Chromosome Pair 21/genetics, Croatia/ethnology, Human Genome/genetics, Genomics, Haplotypes/genetics, Heterozygote, Humans, Genetic Hybridization/genetics, Phylogeny, Population Density, Siberia, Spain/ethnology, Time Factors
16.
BMC Biol ; 19(1): 30, 2021 02 15.
Article in English | MEDLINE | ID: mdl-33588838

ABSTRACT

BACKGROUND: The concentrations of distinct types of RNA in cells result from a dynamic equilibrium between RNA synthesis and decay. Despite the critical importance of RNA decay rates, current approaches for measuring them are generally labor-intensive, limited in sensitivity, and/or disruptive to normal cellular processes. Here, we introduce a simple method for estimating relative RNA half-lives that is based on two standard and widely available high-throughput assays: Precision Run-On sequencing (PRO-seq) and RNA sequencing (RNA-seq). RESULTS: Our method treats PRO-seq as a measure of transcription rate and RNA-seq as a measure of RNA concentration, and estimates the rate of RNA decay required for a steady-state equilibrium. We show that this approach can be used to assay relative RNA half-lives genome-wide, with good accuracy and sensitivity for both coding and noncoding transcription units. Using a structural equation model (SEM), we test several features of transcription units, nearby DNA sequences, and nearby epigenomic marks for associations with RNA stability after controlling for their effects on transcription. We find that RNA splicing-related features are positively correlated with RNA stability, whereas features related to miRNA binding and DNA methylation are negatively correlated with RNA stability. Furthermore, we find that a measure based on U1 binding and polyadenylation sites distinguishes between unstable noncoding and stable coding transcripts but is not predictive of relative stability within the mRNA or lincRNA classes. We also identify several histone modifications that are associated with RNA stability. CONCLUSION: We introduce an approach for estimating the relative half-lives of individual RNAs. Together, our estimation method and systematic analysis shed light on the pervasive impacts of RNA stability on cellular RNA concentrations.
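
The steady-state reasoning in this abstract can be made concrete with a short sketch. It assumes only what the abstract states: PRO-seq signal stands in for transcription rate, RNA-seq signal stands in for RNA concentration, and at steady state synthesis balances decay, so the decay rate is proportional to the ratio of the two. The function name `relative_half_lives`, the dictionary inputs, and the `pseudocount` are hypothetical choices for illustration, not the authors' software, and because the two assays are on arbitrary scales the output is meaningful only as a relative ranking between transcription units.

```python
import math


def relative_half_lives(pro_seq, rna_seq, pseudocount=1.0):
    """Estimate relative RNA half-lives per transcription unit.

    pro_seq and rna_seq map transcription-unit names to normalized
    signal (assumed inputs). At steady state:
        synthesis = decay_rate * concentration
    so decay_rate is proportional to PRO-seq / RNA-seq, and
        half_life = ln(2) / decay_rate.
    The pseudocount (an assumption) avoids division by zero.
    """
    half_lives = {}
    for unit in pro_seq:
        synthesis = pro_seq[unit] + pseudocount      # proxy for transcription rate
        concentration = rna_seq[unit] + pseudocount  # proxy for RNA abundance
        decay_rate = synthesis / concentration       # relative units only
        half_lives[unit] = math.log(2) / decay_rate
    return half_lives


# Two hypothetical units transcribed at the same rate; geneA accumulates
# to a four-fold higher level, so its RNA must be four-fold more stable.
example_pro = {"geneA": 99.0, "geneB": 99.0}
example_rna = {"geneA": 399.0, "geneB": 99.0}
half_lives = relative_half_lives(example_pro, example_rna)
```

Comparing `half_lives["geneA"]` with `half_lives["geneB"]` recovers the four-fold difference in stability; the absolute values carry no units.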


Subjects
Genomic Instability, High-Throughput Nucleotide Sequencing/methods, RNA Stability, High-Throughput Nucleotide Sequencing/instrumentation, Humans, RNA-Seq/methods
17.
J Mammary Gland Biol Neoplasia ; 26(1): 43-66, 2021 03.
Article in English | MEDLINE | ID: mdl-33988830

ABSTRACT

The developing mammary gland depends on several transcription-dependent networks to define cellular identities and differentiation trajectories. Recent technological advancements that allow for single-cell profiling of gene expression have provided an initial picture into the epithelial cellular heterogeneity across the diverse stages of gland maturation. Still, a deeper dive into expanded molecular signatures would improve our understanding of the diversity of mammary epithelial and non-epithelial cellular populations across different tissue developmental stages, mouse strains and mammalian species. Here, we combined differential mammary gland fractionation approaches and transcriptional profiles obtained from FACS-isolated mammary cells to improve our definitions of mammary-resident, cellular identities at the single-cell level. Our approach yielded a series of expression signatures that illustrate the heterogeneity of mammary epithelial cells, specifically those of the luminal fate, and uncovered transcriptional changes to their lineage-defined, cellular states that are induced during gland development. Our analysis also provided molecular signatures that identified non-epithelial mammary cells, including adipocytes, fibroblasts and rare immune cells. Lastly, we extended our study to elucidate expression signatures of human, breast-resident cells, a strategy that allowed for the cross-species comparison of mammary epithelial identities. Collectively, our approach improved the existing signatures of normal mammary epithelial cells, as well as elucidated the diversity of non-epithelial cells in murine and human breast tissue. Our study provides a useful resource for future studies that use single-cell molecular profiling strategies to understand normal and malignant breast development.


Subjects
Epithelial Cells/physiology, Gene Expression Profiling/methods, Animal Mammary Glands/physiology, Human Mammary Glands/physiology, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Transcriptome, Animals, Cell Lineage/physiology, Epithelial Cells/cytology, Female, Humans, Animal Mammary Glands/cytology, Human Mammary Glands/cytology, Mice, Inbred BALB C Mice, Inbred C57BL Mice
18.
Mol Biol Evol ; 37(7): 2137-2152, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32176292

ABSTRACT

Evolutionary changes in gene expression are often driven by gains and losses of cis-regulatory elements (CREs). The dynamics of CRE evolution can be examined using multispecies epigenomic data, but so far such analyses have generally been descriptive and model-free. Here, we introduce a probabilistic modeling framework for the evolution of CREs that operates directly on raw chromatin immunoprecipitation and sequencing (ChIP-seq) data and fully considers the phylogenetic relationships among species. Our framework includes a phylogenetic hidden Markov model, called epiPhyloHMM, for identifying the locations of multiply aligned CREs, and a combined phylogenetic and generalized linear model, called phyloGLM, for accounting for the influence of a rich set of genomic features in describing their evolutionary dynamics. We apply these methods to previously published ChIP-seq data for the H3K4me3 and H3K27ac histone modifications in liver tissue from nine mammals. We find that enhancers are gained and lost during mammalian evolution at about twice the rate of promoters, and that turnover rates are negatively correlated with DNA sequence conservation, expression level, and tissue breadth, and positively correlated with distance from the transcription start site, consistent with previous findings. In addition, we find that the predicted dosage sensitivity of target genes positively correlates with DNA sequence constraint in CREs but not with turnover rates, perhaps owing to differences in the effect sizes of the relevant mutations. Altogether, our probabilistic modeling framework enables a variety of powerful new analyses.


Subjects
Epigenomics/methods, Molecular Evolution, Genetic Models, Phylogeny, Transcriptional Regulatory Elements, Animals, Chromatin Immunoprecipitation Sequencing, Histone Code/genetics, Mammals/genetics
19.
Genome Res ; 28(1): 52-65, 2018 01.
Article in English | MEDLINE | ID: mdl-29233922

ABSTRACT

To assess miRNA evolution across the Drosophila genus, we analyzed several billion small RNA reads across 12 fruit fly species. These data permit comprehensive curation of species- and clade-specific variation in miRNA identity, abundance, and processing. Among well-conserved miRNAs, we observed unexpected cases of clade-specific variation in 5' end precision, occasional antisense loci, and putatively noncanonical loci. We also used strict criteria to identify a large set (649) of novel, evolutionarily restricted miRNAs. Within the bulk collection of species-restricted miRNAs, two notable subpopulations are splicing-derived mirtrons and testes-restricted, recently evolved, clustered (TRC) canonical miRNAs. We quantified miRNA birth and death using our annotation and a phylogenetic model for estimating rates of miRNA turnover. We observed striking differences in birth and death rates across miRNA classes defined by biogenesis pathway, genomic clustering, and tissue restriction, and even identified flux heterogeneity among Drosophila clades. In particular, distinct molecular rationales underlie the distinct evolutionary behavior of different miRNA classes. Mirtrons are associated with high rates of 3' untemplated addition, a mechanism that impedes their biogenesis, whereas TRC miRNAs appear to evolve under positive selection. Altogether, these data reveal miRNA diversity among Drosophila species and principles underlying their emergence and evolution.


Subjects
3' Untranslated Regions, Drosophila/genetics, Molecular Evolution, Gene Expression Profiling, Genetic Loci, MicroRNAs/genetics, Animals, Species Specificity
20.
Mol Cell ; 50(2): 212-22, 2013 Apr 25.
Article in English | MEDLINE | ID: mdl-23523369

ABSTRACT

RNA polymerase II (Pol II) transcribes hundreds of kilobases of DNA, limiting the production of mRNAs and lncRNAs. We used global run-on sequencing (GRO-seq) to measure the rates of transcription by Pol II following gene activation. Elongation rates vary as much as 4-fold at different genomic loci and in response to two distinct cellular signaling pathways (i.e., 17β-estradiol [E2] and TNF-α). The rates are slowest near the promoter and increase during the first ~15 kb transcribed. Gene body elongation rates correlate with Pol II density, resulting in systematically higher rates of transcript production at genes with higher Pol II density. Pol II dynamics following short inductions indicate that E2 stimulates gene expression by increasing Pol II initiation, whereas TNF-α reduces Pol II residence time at pause sites. Collectively, our results identify previously uncharacterized variation in the rate of transcription and highlight elongation as an important, variable, and regulated rate-limiting step during transcription.


Subjects
RNA Polymerase II/metabolism, Messenger RNA/biosynthesis, Signal Transduction, Genetic Transcription Initiation, Estradiol/pharmacology, Estradiol/physiology, Humans, Kinetics, MCF-7 Cells, Genetic Promoter Regions, RNA Polymerase II/physiology, Messenger RNA/genetics, Transcription Factors/metabolism, Transcription Initiation Site, Genetic Transcription, Transcriptional Activation, Transcriptome, Tumor Necrosis Factor-alpha/pharmacology, Tumor Necrosis Factor-alpha/physiology