ABSTRACT
In eukaryotes, capped RNAs include long transcripts such as messenger RNAs and long noncoding RNAs, as well as shorter transcripts such as spliceosomal RNAs, small nucleolar RNAs, and enhancer RNAs. Long capped transcripts can be profiled using cap analysis gene expression (CAGE) sequencing and other methods. Here, we describe a sequencing library preparation protocol for short capped RNAs, apply it to a differentiation time course of the human cell line THP-1, and systematically compare the landscape of short capped RNAs to that of long capped RNAs. Transcription initiation peaks associated with genes in the sense direction have a strong preference to produce either long or short capped RNAs, with one out of six peaks detected in the short capped RNA libraries only. Gene-associated short capped RNAs have highly specific 3' ends, typically overlapping splice sites. Enhancers also preferentially generate either short or long capped RNAs, with 10% of enhancers observed in the short capped RNA libraries only. Enhancers producing either short or long capped RNAs show enrichment for GWAS-associated disease SNPs. We conclude that deep sequencing of short capped RNAs reveals new families of noncoding RNAs and elucidates the diversity of transcripts generated at known and novel promoters and enhancers.
ABSTRACT
A core promoter is a stretch of DNA surrounding the transcription start site (TSS) that integrates regulatory inputs and recruits general transcription factors to initiate transcription. The nature and causative relationship of the DNA sequence and chromatin signals that govern the selection of most TSSs by RNA polymerase II remain unresolved. Maternal to zygotic transition represents the most marked change of the transcriptome repertoire in the vertebrate life cycle. Early embryonic development in zebrafish is characterized by a series of transcriptionally silent cell cycles regulated by inherited maternal gene products: zygotic genome activation commences at the tenth cell cycle, marking the mid-blastula transition. This transition provides a unique opportunity to study the rules of TSS selection and the hierarchy of events linking transcription initiation with key chromatin modifications. We analysed TSS usage during zebrafish early embryonic development at high resolution using cap analysis of gene expression, and determined the positions of H3K4me3-marked promoter-associated nucleosomes. Here we show that the transition from the maternal to zygotic transcriptome is characterized by a switch between two fundamentally different modes of defining transcription initiation, which drive the dynamic change of TSS usage and promoter shape. A maternal-specific TSS selection, which requires an A/T-rich (W-box) motif, is replaced with a zygotic TSS selection grammar characterized by broader patterns of dinucleotide enrichments, precisely aligned with the first downstream (+1) nucleosome. The developmental dynamics of the H3K4me3-marked nucleosomes reveal their DNA-sequence-associated positioning at promoters before zygotic transcription and subsequent transcription-independent adjustment to the final position downstream of the zygotic TSS. 
The two TSS-defining grammars coexist, often physically overlapping, in core promoters of constitutively expressed genes to enable their expression in the two regulatory environments. The dissection of overlapping core promoter determinants represents a framework for future studies of promoter structure and function across different regulatory contexts.
Subject(s)
Promoter Regions, Genetic/genetics , Transcription Initiation Site , Zebrafish/genetics , Animals , Base Sequence , Embryo, Nonmammalian/embryology , Embryo, Nonmammalian/metabolism , Female , Gene Expression Regulation, Developmental/genetics , Histones/metabolism , Methylation , Mothers , Nucleosomes/genetics , Transcription Initiation, Genetic , Transcriptome/genetics , Zebrafish/embryology , Zygote/metabolism
ABSTRACT
Animal transcriptomes are dynamic, with each cell type, tissue and organ system expressing an ensemble of transcript isoforms that give rise to substantial diversity. Here we have identified new genes, transcripts and proteins using poly(A)+ RNA sequencing from Drosophila melanogaster in cultured cell lines, dissected organ systems and under environmental perturbations. We found that a small set of mostly neural-specific genes has the potential to encode thousands of transcripts each through extensive alternative promoter usage and RNA splicing. The magnitudes of splicing changes are larger between tissues than between developmental stages, and most sex-specific splicing is gonad-specific. Gonads express hundreds of previously unknown coding and long non-coding RNAs (lncRNAs), some of which are antisense to protein-coding genes and produce short regulatory RNAs. Furthermore, previously identified pervasive intergenic transcription occurs primarily within newly identified introns. The fly transcriptome is substantially more complex than previously recognized, with this complexity arising from combinatorial usage of promoters, splice sites and polyadenylation sites.
Subject(s)
Drosophila melanogaster/genetics , Gene Expression Profiling , Transcriptome/genetics , Alternative Splicing/genetics , Animals , Drosophila melanogaster/anatomy & histology , Drosophila melanogaster/cytology , Female , Male , Molecular Sequence Annotation , Nerve Tissue/metabolism , Organ Specificity , Poly A/genetics , Polyadenylation , Promoter Regions, Genetic/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Sex Characteristics , Stress, Physiological/genetics
ABSTRACT
An increasing number of noncoding RNAs (ncRNAs) have been implicated in various human diseases including cancer; however, the ncRNA transcriptome of hepatocellular carcinoma (HCC) is largely unexplored. We used CAGE to map transcription start sites across various types of human and mouse HCCs with emphasis on ncRNAs distant from protein-coding genes. Here, we report that retroviral LTR promoters, expressed in healthy tissues such as testis and placenta but not liver, are widely activated in liver tumors. Despite HCC heterogeneity, a subset of LTR-derived ncRNAs were more than 10-fold up-regulated in the vast majority of samples. HCCs with a high LTR activity mostly had a viral etiology, were less differentiated, and showed higher risk of recurrence. ChIP-seq data show that MYC and MAX are associated with ncRNA deregulation. Globally, CAGE enabled us to build a mammalian promoter map for HCC, which uncovers a new layer of complexity in HCC genomics.
Subject(s)
Carcinoma, Hepatocellular/etiology , Gene Expression Profiling , Liver Neoplasms/etiology , Promoter Regions, Genetic , RNA, Untranslated/genetics , Terminal Repeat Sequences , Transcription Initiation Site , ATP Binding Cassette Transporter, Subfamily B/genetics , Animals , Carcinoma, Hepatocellular/pathology , Cell Line, Tumor , Cell Transformation, Neoplastic/genetics , Cell Transformation, Viral , Computational Biology/methods , Disease Models, Animal , Disease Progression , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Humans , Liver Neoplasms/pathology , Mice , Mice, Knockout , Protein Binding , Transcription Factors/metabolism , Transcriptome , ATP-Binding Cassette Sub-Family B Member 4
ABSTRACT
Underlying the complexity of the mammalian brain is its network of neuronal connections, but also the molecular networks of signaling pathways, protein interactions, and regulated gene expression within each individual neuron. The diversity and complexity of the spatially intermingled neurons pose a serious challenge to the identification and quantification of single neuron components. To address this challenge, we present a novel approach for the study of the ribosome-associated transcriptome (the translatome) from selected subcellular domains of specific neurons, and apply it to the Purkinje cells (PCs) in the rat cerebellum. We combined microdissection, translating ribosome affinity purification (TRAP) in nontransgenic animals, and quantitative nanoCAGE sequencing to obtain a snapshot of RNAs bound to cytoplasmic or rough endoplasmic reticulum (rER)-associated ribosomes in the PC and its dendrites. This allowed us to discover novel markers of PCs, to determine structural aspects of genes, to find hitherto uncharacterized transcripts, and to quantify biophysically relevant genes of membrane proteins controlling ion homeostasis and neuronal electrical activities.
Subject(s)
Gene Expression Profiling , Purkinje Cells/metabolism , Animals , Binding Sites , Chromosome Mapping , Cluster Analysis , Cytoplasm/metabolism , Dendrites/metabolism , Endoplasmic Reticulum, Rough/metabolism , Multigene Family , Promoter Regions, Genetic , Protein Biosynthesis , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , Rats , Ribosomes/physiology , Transcriptome
ABSTRACT
Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.
Subject(s)
Computational Biology/methods , Drosophila melanogaster/genetics , Gene Expression Profiling , Molecular Sequence Annotation , Transcriptome , Animals , Cluster Analysis , Drosophila melanogaster/classification , Evolution, Molecular , Exons , Female , Genome, Insect , Humans , Male , Nucleotide Motifs , Phylogeny , Position-Specific Scoring Matrices , Promoter Regions, Genetic , RNA Editing , RNA Splice Sites , RNA Splicing , Reproducibility of Results , Transcription Initiation Site
ABSTRACT
Hepatitis B virus (HBV) is a major cause of liver diseases, including hepatocellular carcinoma (HCC), and more than 650,000 people die annually due to HBV-associated liver failure. Extensive studies of individual promoters have revealed that heterogeneous RNA 5' ends contribute to the complexity of HBV transcriptome and proteome. Here, we provide a comprehensive map of HBV transcription start sites (TSSs) in human liver, HCC, and blood, as well as several experimental replication systems, at a single-nucleotide resolution. Using CAGE (cap analysis of gene expression) analysis of 16 HCC/nontumor liver pairs, we identify 17 robust TSSs, including a novel promoter for the X gene located in the middle of the gene body, which potentially produces a shorter X protein translated from the conserved second start codon, and two minor antisense transcripts that might represent viral noncoding RNAs. Interestingly, transcription profiles were similar in HCC and nontumor livers, although quantitative analysis revealed highly variable patterns of TSS usage among clinical samples, reflecting precise regulation of HBV transcription initiation at each promoter. Unlike the variety of TSSs found in liver and HCC, the vast majority of transcripts detected in HBV-positive blood samples are pregenomic RNA, most likely generated and released from liver. Our quantitative TSS mapping using the CAGE technology will allow better understanding of HBV transcriptional responses in further studies aimed at eradicating HBV in chronic carriers. IMPORTANCE: Despite the availability of a safe and effective vaccine, HBV infection remains a global health problem, and current antiviral protocols are not able to eliminate the virus in chronic carriers. Previous studies of the regulation of HBV transcription have described four major promoters and two enhancers, but little is known about their activity in human livers and HCC. 
We deeply sequenced the HBV RNA 5' ends in clinical human samples and experimental models by using a new, sensitive and quantitative method termed cap analysis of gene expression (CAGE). Our data provide the first comprehensive map of global TSS distribution over the entire HBV genome in the human liver, validating already known promoters and identifying novel locations. Better knowledge of HBV transcriptional activity in the clinical setting has critical implications in the evaluation of therapeutic approaches that target HBV replication.
Subject(s)
Carcinoma, Hepatocellular/virology , Hepatitis B virus/genetics , Hepatitis B, Chronic/virology , Liver Neoplasms/virology , Promoter Regions, Genetic , Adult , Aged , Animals , Chromosome Mapping , Female , Genome, Viral , Hep G2 Cells , Hepatitis B virus/pathogenicity , Humans , Liver/virology , Male , Mice , Middle Aged , RNA Caps/genetics , RNA, Viral/genetics , Transcription Initiation Site , Transcriptome
ABSTRACT
Multidrug resistance 2 (Mdr2), also called adenosine triphosphate-binding cassette B4 (ABCB4), is the transporter of phosphatidylcholine (PC) at the canalicular membrane of mouse hepatocytes and plays an essential role in bile formation. Mutations in the human homologue MDR3 are associated with several liver diseases. Knockout of Mdr2 results in hepatic inflammation, liver fibrosis and hepatocellular carcinoma (HCC). Whereas the pathogenesis in Mdr2 (-/-) mice has been largely attributed to the toxicity of bile acids due to the absence of PC in the bile, the question of whether Mdr2 deficiency per se perturbs biological functions in the cell has been poorly addressed. As Mdr2 is expressed in many cell types, we used mouse embryonic fibroblasts (MEFs) derived from Mdr2 (-/-) embryos to show that deficiency of Mdr2 increases reactive oxygen species accumulation, lipid peroxidation and DNA damage. We found that Mdr2 (-/-) MEFs undergo spontaneous transformation and that Mdr2 (-/-) mice are more susceptible to chemical carcinogen-induced intestinal tumorigenesis. Microarray analysis in Mdr2 (-/-) MEFs and cap analysis of gene expression in Mdr2 (-/-) HCCs revealed extensively deregulated genes involved in oxidation reduction, fatty acid metabolism and lipid biosynthesis. Our findings imply a close link between Mdr2 (-/-)-associated tumorigenesis and perturbation of these biological processes and suggest potential extrahepatic functions of Mdr2/MDR3.
Subject(s)
ATP Binding Cassette Transporter, Subfamily B/deficiency , Cell Transformation, Neoplastic/metabolism , Oxidative Stress/physiology , ATP Binding Cassette Transporter, Subfamily B/genetics , ATP Binding Cassette Transporter, Subfamily B/metabolism , Adenomatous Polyposis Coli/metabolism , Adenomatous Polyposis Coli/pathology , Animals , Apoptosis/drug effects , Apoptosis/physiology , Cell Transformation, Neoplastic/genetics , Cell Transformation, Neoplastic/pathology , Cells, Cultured , DNA Damage , Female , Fibroblasts/metabolism , Fibroblasts/pathology , Intestinal Neoplasms/metabolism , Intestinal Neoplasms/pathology , Lipid Peroxidation , Liver/metabolism , Liver/pathology , Male , Mice , Mice, Inbred BALB C , Mice, Knockout , Mice, Nude , Reactive Oxygen Species/metabolism , ATP-Binding Cassette Sub-Family B Member 4
ABSTRACT
Spatiotemporal control of gene expression is central to animal development. Core promoters represent a previously unanticipated regulatory level by interacting with cis-regulatory elements and transcription initiation in different physiological and developmental contexts. Here, we provide a first and comprehensive description of the core promoter repertoire and its dynamic use during the development of a vertebrate embryo. By using cap analysis of gene expression (CAGE), we mapped transcription initiation events at single nucleotide resolution across 12 stages of zebrafish development. These CAGE-based transcriptome maps reveal genome-wide rules of core promoter usage, structure, and dynamics, key to understanding the control of gene regulation during vertebrate ontogeny. They revealed the existence of multiple classes of pervasive intra- and intergenic post-transcriptionally processed RNA products and their developmental dynamics. Among these RNAs, we report splice donor site-associated intronic RNA (sRNA) to be specific to genes of the splicing machinery. For the identification of conserved features, we compared the zebrafish data sets to the first CAGE promoter map of Tetraodon and the existing human CAGE data. We show that a number of features, such as promoter type, newly discovered promoter properties such as a specialized purine-rich initiator motif, as well as sRNAs and the genes in which they are detected, are conserved in mammalian and Tetraodon CAGE-defined promoter maps. The zebrafish developmental promoterome represents a powerful resource for studying developmental gene regulation and revealing promoter features shared across vertebrates.
Subject(s)
Embryonic Development/genetics , Gene Expression Regulation, Developmental , Purines/metabolism , Transcription Initiation Site , Zebrafish/embryology , Zebrafish/genetics , Animals , Evolution, Molecular , Gene Expression Profiling , Genes , Genome , Phylogeny , Promoter Regions, Genetic , RNA/genetics , RNA/metabolism , RNA Caps/genetics , RNA Splicing , Transcriptome , Vertebrates/genetics
ABSTRACT
Template switching (TS) is an inherent mechanism of reverse transcriptase that has been exploited in several transcriptome analysis methods, such as CAGE, RNA-Seq and short RNA sequencing. TS is an attractive option, given the simplicity of the protocol, which does not require an adaptor-mediated step and thus minimizes sample loss. As such, it has been used in several studies that deal with limited amounts of RNA, such as single-cell studies. Additionally, TS has been used to introduce DNA barcodes or indexes into different samples, cells or molecules. This labeling allows one to pool several samples into one sequencing flow cell, which increases the data throughput of sequencing and takes advantage of the increasing throughput of current sequencers. Here, we report TS artifacts that form owing to a process called strand invasion. Because of the way in which barcodes/indexes are introduced by TS, strand invasion becomes more problematic by introducing unsystematic biases. We describe a strategy that eliminates these artifacts in silico and propose an experimental solution that suppresses biases from TS.
Subject(s)
Artifacts , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Animals , Humans , Mice , RNA/blood , RNA/chemistry , Rats , Sensitivity and Specificity , Templates, Genetic
ABSTRACT
Variations in transcription start site (TSS) selection reflect diversity of preinitiation complexes and can impact on post-transcriptional RNA fates. Most metazoan polymerase II-transcribed genes carry canonical initiation with pyrimidine/purine (YR) dinucleotide, while translation machinery-associated genes carry polypyrimidine initiator (5'-TOP or TCT). By addressing the developmental regulation of TSS selection in zebrafish we uncovered a class of dual-initiation promoters in thousands of genes, including snoRNA host genes. 5'-TOP/TCT initiation is intertwined with canonical initiation and used divergently in hundreds of dual-initiation promoters during maternal to zygotic transition. Dual-initiation in snoRNA host genes selectively generates host and snoRNA with often different spatio-temporal expression. Dual-initiation promoters are pervasive in human and fruit fly, reflecting evolutionary conservation. We propose that dual-initiation on shared promoters represents a composite promoter architecture, which can function both coordinately and divergently to diversify RNAs.
Subject(s)
Gene Expression Regulation, Developmental , Gene Regulatory Networks , Promoter Regions, Genetic/genetics , Transcription Initiation Site , Transcription, Genetic , Animals , Base Sequence , Drosophila/genetics , Drosophila/growth & development , Humans , RNA/genetics , RNA/physiology , RNA, Small Nucleolar/genetics , RNA, Small Nucleolar/metabolism , RNA, Small Untranslated/genetics , RNA, Small Untranslated/physiology , RNA, Untranslated/genetics , RNA, Untranslated/physiology , Regulatory Elements, Transcriptional , Zebrafish/genetics , Zebrafish/growth & development , Zygote
ABSTRACT
Mammalian genomes encode tens of thousands of noncoding RNAs. Most noncoding transcripts exhibit nuclear localization and several have been shown to play a role in the regulation of gene expression and chromatin remodeling. To investigate the function of such RNAs, methods to massively map the genomic interacting sites of multiple transcripts have been developed; however, these methods have some limitations. Here, we introduce RNA And DNA Interacting Complexes Ligated and sequenced (RADICL-seq), a technology that maps genome-wide RNA-chromatin interactions in intact nuclei. RADICL-seq is a proximity ligation-based methodology that reduces the bias for nascent transcription, while increasing genomic coverage and unique mapping rate efficiency compared with existing methods. RADICL-seq identifies distinct patterns of genome occupancy for different classes of transcripts as well as cell type-specific RNA-chromatin interactions, and highlights the role of transcription in the establishment of chromatin structure.
Subject(s)
Chromatin/metabolism , Chromosome Mapping/methods , High-Throughput Nucleotide Sequencing/methods , RNA, Untranslated/genetics , Sequence Analysis, RNA/methods , Animals , Cell Line , Cell Nucleus/genetics , Cell Nucleus/metabolism , Chromatin/genetics , Chromatin Assembly and Disassembly/genetics , Gene Library , Mice , Mouse Embryonic Stem Cells , RNA, Untranslated/metabolism , Transcription, Genetic
ABSTRACT
Facultative heterochromatin forms and reorganizes in response to external stimuli. However, how the initial establishment of such a chromatin state is regulated in cell-cycle-arrested cells remains unexplored. Mouse gonocytes are arrested male germ cells, at which stage the genome-wide DNA methylome forms. Here, we discovered transiently accessible heterochromatin domains of several megabases in size in gonocytes and named them differentially accessible domains (DADs). Open DADs formed in gene desert and gene cluster regions, primarily at transposons, with the reprogramming of histone marks, suggesting DADs as facultative heterochromatin. De novo DNA methylation took place with two waves in gonocytes: the first region specific and the second genome-wide. DADs were resistant to the first wave and their opening preceded the second wave. In addition, the higher-order chromosome architecture was reorganized with less defined chromosome compartments in gonocytes. These findings suggest that multiple layers of chromatin reprogramming facilitate de novo DNA methylation.
Subject(s)
DNA Methylation , Germ Cells/chemistry , Heterochromatin/chemistry , Testis/embryology , Animals , Cell Cycle , Chromatin/chemistry , Chromosomes , Genome , Histones/chemistry , Male , Mice , Mice, Inbred C57BL
ABSTRACT
Identification of important, functional small RNA (sRNA) species is currently hampered by the lack of reliable and sensitive methods to isolate and characterize them. We have developed a method, termed target-enrichment of sRNAs (TEsR), that enables targeted sequencing of rare sRNAs and diverse precursor and mature forms of sRNAs not detectable by current standard sRNA sequencing methods. It is based on the amplification of full-length sRNA molecules, production of biotinylated RNA probes, hybridization to one or multiple targeted RNAs, removal of nontargeted sRNAs and sequencing. By this approach, target sRNAs can be enriched by a factor of 500-30,000 while maintaining strand specificity. TEsR enriches for sRNAs irrespective of length or different molecular features, such as the presence or absence of a 5' cap or of secondary structures or abundance levels. Moreover, TEsR allows the detection of the complete sequence (including sequence variants, and 5' and 3' ends) of precursors, as well as intermediate and mature forms, in a quantitative manner. A well-trained molecular biologist can complete the TEsR procedure, from RNA extraction to sequencing library preparation, within 4-6 d.
Subject(s)
Nucleic Acid Amplification Techniques/methods , RNA, Small Untranslated/genetics , RNA, Small Untranslated/isolation & purification , Sequence Analysis, RNA/methods
ABSTRACT
The success of anesthetic practice and of surgical procedures performed on the mandible depends, among other factors, on correctly locating the mandibular canals. However, accessory mandibular canals and/or alternative paths followed by the inferior alveolar nerve in the mandibular body and ramus may exist, considerably reducing the rate of such success. Few professionals are aware of these variations, which, when not identified before analgesia and surgical interventions, hinder these procedures and can even trigger unexpected hemorrhages. The present work presents a literature review and a case report on the topic, seeking to draw attention to the existence of accessory mandibular canals and to facilitate their identification. The occurrence, frequency, and configuration of these canals, and their classifications according to the authors who identified them, are addressed. Alternative anesthetic and surgical approaches, outside the commonly encountered standard, that can be used in these specific cases are also presented.