1.
Nucleic Acids Res ; 51(9): 4191-4207, 2023 05 22.
Article in English | MEDLINE | ID: mdl-37026479

ABSTRACT

Adenosine deaminase acting on RNA (ADAR1) promotes A-to-I conversion in double-stranded and structured RNAs. ADAR1 has two isoforms transcribed from different promoters: cytoplasmic ADAR1p150 is interferon-inducible while ADAR1p110 is constitutively expressed and primarily localized in the nucleus. Mutations in ADAR1 cause Aicardi-Goutières syndrome (AGS), a severe autoinflammatory disease associated with aberrant IFN production. In mice, deletion of ADAR1 or the p150 isoform leads to embryonic lethality driven by overexpression of interferon-stimulated genes. This phenotype is rescued by deletion of the cytoplasmic dsRNA-sensor MDA5, indicating that the p150 isoform is indispensable and cannot be rescued by ADAR1p110. Nevertheless, editing sites uniquely targeted by ADAR1p150 remain elusive. Here, by transfection of ADAR1 isoforms into ADAR-less mouse cells, we detect isoform-specific editing patterns. Using mutated ADAR variants, we test how intracellular localization and the presence of a Z-DNA binding domain-α affect editing preferences. These data show that ZBDα only minimally contributes to p150 editing-specificity while isoform-specific editing is primarily directed by the intracellular localization of ADAR1 isoforms. Our study is complemented by RIP-seq on human cells ectopically expressing tagged-ADAR1 isoforms. Both datasets reveal enrichment of intronic editing and binding by ADAR1p110 while ADAR1p150 preferentially binds and edits 3'UTRs.


Subject(s)
Adenosine Deaminase, Interferons, RNA Editing, Double-Stranded RNA, Animals, Humans, Mice, Adenosine Deaminase/genetics, Adenosine Deaminase/metabolism, Cell Nucleus/metabolism, Cytoplasm/metabolism, Interferons/genetics, Protein Isoforms/genetics, Protein Isoforms/metabolism, Double-Stranded RNA/genetics
2.
PLoS Genet ; 18(8): e1010376, 2022 08.
Article in English | MEDLINE | ID: mdl-35994477

ABSTRACT

The class I histone deacetylases are essential regulators of cell fate decisions in health and disease. While pan- and class-specific HDAC inhibitors are available, these drugs do not allow a comprehensive understanding of individual HDAC function, or the therapeutic potential of isoform-specific targeting. To systematically compare the impact of individual catalytic functions of HDAC1, HDAC2 and HDAC3, we generated human HAP1 cell lines expressing catalytically inactive HDAC enzymes. Using this genetic toolbox we compare the effect of individual HDAC inhibition with the effects of class I specific inhibitors on cell viability, protein acetylation and gene expression. Individual inactivation of HDAC1 or HDAC2 has only mild effects on cell viability, while HDAC3 inactivation or loss results in DNA damage and apoptosis. Inactivation of HDAC1/HDAC2 led to increased acetylation of components of the COREST co-repressor complex, reduced deacetylase activity associated with this complex and derepression of neuronal genes. HDAC3 controls the acetylation of nuclear hormone receptor associated proteins and the expression of nuclear hormone receptor regulated genes. Acetylation of specific histone acetyltransferases and HDACs is sensitive to inactivation of HDAC1/HDAC2. Over a wide range of assays, we determined that in particular HDAC1 or HDAC2 catalytic inactivation mimics class I specific HDAC inhibitors. Importantly, we further demonstrate that catalytic inactivation of HDAC1 or HDAC2 sensitizes cells to specific cancer drugs. In summary, our systematic study revealed isoform-specific roles of HDAC1/2/3 catalytic functions. We suggest that targeted genetic inactivation of particular isoforms effectively mimics pharmacological HDAC inhibition allowing the identification of relevant HDACs as targets for therapeutic intervention.


Subject(s)
Histone Deacetylase 1, Histone Deacetylase Inhibitors, Acetylation, Histone Deacetylase 1/genetics, Histone Deacetylase 1/metabolism, Histone Deacetylase 2/genetics, Histone Deacetylase 2/metabolism, Histone Deacetylase Inhibitors/pharmacology, Histone Deacetylases/genetics, Histone Deacetylases/metabolism, Humans, Protein Isoforms/genetics, Protein Isoforms/metabolism
3.
Bioinformatics ; 37(15): 2126-2133, 2021 Aug 09.
Article in English | MEDLINE | ID: mdl-33538792

ABSTRACT

MOTIVATION: Predicting the folding dynamics of RNAs is a computationally difficult problem, first and foremost due to the combinatorial explosion of alternative structures in the folding space. Abstractions are therefore needed to simplify downstream analyses, and thus make them computationally tractable. This can be achieved by various structure sampling algorithms. However, current sampling methods are still time consuming and frequently fail to represent key elements of the folding space. METHOD: We introduce RNAxplorer, a novel adaptive sampling method to efficiently explore the structure space of RNAs. RNAxplorer uses dynamic programming to perform an efficient Boltzmann sampling in the presence of guiding potentials, which are accumulated into pseudo-energy terms and reflect similarity to already well-sampled structures. This way, we effectively steer sampling toward underrepresented or unexplored regions of the structure space. RESULTS: We developed and applied different measures to benchmark our sampling method against its competitors. Most of the measures show that RNAxplorer produces more diverse structure samples, yields rare conformations that may be inaccessible to other sampling methods and is better at finding the most relevant kinetic traps in the landscape. Thus, it produces a more representative coarse graining of the landscape, which is well suited to subsequently compute better approximations of RNA folding kinetics. AVAILABILITY AND IMPLEMENTATION: https://github.com/ViennaRNA/RNAxplorer/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
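
Editorial illustration (not code from RNAxplorer): the adaptive idea summarized in the abstract above, Boltzmann sampling steered by accumulated pseudo-energy penalties, can be sketched on a toy problem. The sketch below assumes a small, explicitly enumerated set of states with made-up energies, whereas the abstract describes dynamic programming over the full structure space. The function names, the `penalty` parameter, and the thermal energy value `kT` (roughly RT at 37 °C in kcal/mol) are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def boltzmann_probs(energies, kT=0.616):
    """Return Boltzmann probabilities p(s) = exp(-E(s)/kT) / Z for each state."""
    weights = {s: math.exp(-e / kT) for s, e in energies.items()}
    z = sum(weights.values())  # partition function over the toy state set
    return {s: w / z for s, w in weights.items()}


def adaptive_sample(energies, rounds=5, per_round=200, penalty=0.5,
                    kT=0.616, seed=0):
    """Toy adaptive sampling: after each round, states that were drawn often
    receive a larger guiding potential (pseudo-energy), which lowers their
    probability in later rounds and pushes sampling toward rarer states."""
    rng = random.Random(seed)
    guide = {s: 0.0 for s in energies}  # accumulated pseudo-energy per state
    counts = Counter()
    states = list(energies)
    for _ in range(rounds):
        # Sample from the ensemble defined by real energy plus guiding potential.
        biased = {s: energies[s] + guide[s] for s in states}
        probs = boltzmann_probs(biased, kT)
        drawn = rng.choices(states, weights=[probs[s] for s in states],
                            k=per_round)
        round_counts = Counter(drawn)
        counts.update(round_counts)
        # Penalize each state in proportion to its share of this round's sample.
        for s, c in round_counts.items():
            guide[s] += penalty * c / per_round
    return counts, guide
```

Calling `adaptive_sample({"A": -5.0, "B": -3.0, "C": -1.0})` returns the per-state sample counts and the accumulated guiding potentials. Increasing `penalty` or `rounds` shifts more of the sample toward the higher-energy states.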

4.
Plant Physiol ; 180(1): 305-322, 2019 05.
Article in English | MEDLINE | ID: mdl-30760640

ABSTRACT

Cis-Natural Antisense Transcripts (cis-NATs), which overlap protein coding genes and are transcribed from the opposite DNA strand, constitute an important group of noncoding RNAs. Whereas several examples of cis-NATs regulating the expression of their cognate sense gene are known, most cis-NATs function by altering the steady-state level or structure of mRNA via changes in transcription, mRNA stability, or splicing, and very few cases involve the regulation of sense mRNA translation. This study was designed to systematically search for cis-NATs influencing cognate sense mRNA translation in Arabidopsis (Arabidopsis thaliana). Establishment of a pipeline relying on sequencing of total polyA+ and polysomal RNA from Arabidopsis grown under various conditions (i.e. nutrient deprivation and phytohormone treatments) allowed the identification of 14 cis-NATs whose expression correlated either positively or negatively with cognate sense mRNA translation. With use of a combination of cis-NAT stable over-expression in transgenic plants and transient expression in protoplasts, the impact of cis-NAT expression on mRNA translation was confirmed for 4 out of 5 tested cis-NAT:sense mRNA pairs. These results expand the number of cis-NATs known to regulate cognate sense mRNA translation and provide a foundation for future studies of their mode of action. Moreover, this study highlights the role of this class of noncoding RNAs in translation regulation.


Subject(s)
Arabidopsis/genetics, Protein Biosynthesis, Antisense RNA/genetics, Arabidopsis Proteins/genetics, DNA-Binding Proteins/genetics, Plant Gene Expression Regulation, Genetically Modified Plants, Messenger RNA/genetics, Plant RNA, Reproducibility of Results, RNA Sequence Analysis, Transcription Factors/genetics
5.
Methods ; 156: 32-39, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30385321

ABSTRACT

Chemical modifications of RNA nucleotides change their identity and characteristics and thus alter genetic and structural information encoded in the genomic DNA. tRNA and rRNA are probably the most heavily modified genes, and often depend on derivatization or isomerization of their nucleobases in order to correctly fold into their functional structures. Recent RNomics studies, however, report transcriptome-wide RNA modification and suggest a more general regulation of structuredness of RNAs by this so-called epitranscriptome. Modification seems to require specific substrate structures, which in turn are stabilized or destabilized and thus promote or inhibit refolding events of regulatory RNA structures. In this review, we revisit RNA modifications and the related structures from a computational point of view. We discuss known substrate structures, their properties such as sub-motifs as well as consequences of modifications on base pairing patterns and possible refolding events. Given that efficient RNA structure prediction methods for canonical base pairs were established several decades ago, we review to what extent these methods allow the inclusion of modified nucleotides to model and study epitranscriptomic effects on RNA structures.


Subject(s)
Adenosine/metabolism, Inosine/metabolism, Post-Transcriptional RNA Processing, RNA Sequence Analysis/methods, Transcriptome, Animals, Base Pairing, Base Sequence, Humans, Methylation, MicroRNAs/genetics, MicroRNAs/metabolism, Nucleic Acid Conformation, Messenger RNA/genetics, Messenger RNA/metabolism, Ribosomal RNA/genetics, Ribosomal RNA/metabolism, Small Nuclear RNA/genetics, Small Nuclear RNA/metabolism, Transfer RNA/genetics, Transfer RNA/metabolism
6.
Genes (Basel) ; 9(8)2018 Aug 01.
Article in English | MEDLINE | ID: mdl-30071678

ABSTRACT

In this work, we present a computational screen conducted for functional RNA structures, resulting in over 100,000 conserved RNA structure elements found in alignments of mouse (mm10) against 59 other vertebrates. We explicitly included masked repeat regions to explore the potential of transposable elements and low-complexity regions to give rise to regulatory RNA elements. In our analysis pipeline, we implemented a four-step procedure: (i) we screened genome-wide alignments for potential structure elements using RNAz-2, (ii) realigned and refined candidate loci with LocARNA-P, (iii) scored candidates again with RNAz-2 in structure alignment mode, and (iv) searched for additional homologous loci in the mouse genome that were not covered by genome alignments. The 3'-untranslated regions (3'-UTRs) of protein-coding genes and small noncoding RNAs are enriched for structures, while coding sequences are depleted. Repeat-associated loci make up about 95% of the homologous loci identified and are, as expected, predominantly found in intronic and intergenic regions. Nevertheless, we report the structure elements enriched in specific genome elements, such as 3'-UTRs and long noncoding RNAs (lncRNAs). We provide full access to our results via a custom UCSC genome browser trackhub freely available on our website (http://rna.tbi.univie.ac.at/trackhubs/#RNAz).

8.
Sci Rep ; 6: 34589, 2016 10 07.
Article in English | MEDLINE | ID: mdl-27713552

ABSTRACT

The unprecedented outbreak of Ebola in West Africa resulted in over 28,000 cases and 11,000 deaths, underlining the need for a better understanding of the biology of this highly pathogenic virus to develop specific counter strategies. Two filoviruses, the Ebola and Marburg viruses, result in a severe and often fatal infection in humans. However, bats are natural hosts and survive filovirus infections without obvious symptoms. The molecular basis of this striking difference in the response to filovirus infections is not well understood. We report a systematic overview of differentially expressed genes, activity motifs and pathways in human and bat cells infected with the Ebola and Marburg viruses, and we demonstrate that the replication of filoviruses is more rapid in human cells than in bat cells. We also found that the most strongly regulated genes upon filovirus infection are chemokine ligands and transcription factors. We observed a strong induction of the JAK/STAT pathway, of several genes encoding inhibitors of MAP kinases (DUSP genes) and of PPP1R15A, which is involved in ER stress-induced cell death. We used comparative transcriptomics to provide a data resource that can be used to identify cellular responses that might allow bats to survive filovirus infections.


Subject(s)
Ebolavirus/metabolism, Gene Expression Regulation, Ebola Hemorrhagic Fever/metabolism, Marburg Virus Disease/metabolism, Marburgvirus/metabolism, Signal Transduction, Genetic Transcription, Animals, Tumor Cell Line, Chiroptera, Humans
9.
Genome Biol ; 17(1): 220, 2016 10 25.
Article in English | MEDLINE | ID: mdl-27782844

ABSTRACT

BACKGROUND: Short interspersed elements (SINEs) represent the most abundant group of non-long-terminal repeat transposable elements in mammalian genomes. In primates, Alu elements are the most prominent and homogenous representatives of SINEs. Due to their frequent insertion within or close to coding regions, SINEs have been suggested to play a crucial role during genome evolution. Moreover, Alu elements within mRNAs have also been reported to control gene expression at different levels. RESULTS: Here, we undertake a genome-wide analysis of insertion patterns of human Alus within transcribed portions of the genome. Multiple, nearby insertions of SINEs within one transcript are more abundant in tandem orientation than in inverted orientation. Indeed, analysis of transcriptome-wide expression levels of 15 ENCODE cell lines suggests a cis-repressive effect of inverted Alu elements on gene expression. Using reporter assays, we show that the negative effect of inverted SINEs on gene expression is independent of known sensors of double-stranded RNAs. Instead, transcriptional elongation seems impaired, leading to reduced mRNA levels. CONCLUSIONS: Our study suggests that there is a bias against multiple SINE insertions that can promote intramolecular base pairing within a transcript. Moreover, at a genome-wide level, mRNAs harboring inverted SINEs are less expressed than mRNAs harboring single or tandemly arranged SINEs. Finally, we demonstrate a novel mechanism by which inverted SINEs can impact on gene expression by interfering with RNA polymerase II.


Subject(s)
RNA Polymerase II/genetics, Short Interspersed Nucleotide Elements/genetics, Genetic Transcription, Transcriptome/genetics, Alu Elements/genetics, Cell Line, Molecular Evolution, Gene Expression Regulation, Human Genome, Humans, Double-Stranded RNA/genetics, Messenger RNA/genetics
10.
Nat Commun ; 7: 12339, 2016 08 17.
Article in English | MEDLINE | ID: mdl-27531712

ABSTRACT

Long non-coding RNAs (lncRNAs) constitute a large, yet mostly uncharacterized fraction of the mammalian transcriptome. Such characterization requires a comprehensive, high-quality annotation of their gene structure and boundaries, which is currently lacking. Here we describe RACE-Seq, an experimental workflow designed to address this based on RACE (rapid amplification of cDNA ends) and long-read RNA sequencing. We apply RACE-Seq to 398 human lncRNA genes in seven tissues, leading to the discovery of 2,556 on-target, novel transcripts. About 60% of the targeted loci are extended in either 5' or 3', often reaching genomic hallmarks of gene boundaries. Analysis of the novel transcripts suggests that lncRNAs are as long, have as many exons and undergo as much alternative splicing as protein-coding genes, contrary to current assumptions. Overall, we show that RACE-Seq is an effective tool to annotate an organism's deep transcriptome, and compares favourably to other targeted sequencing techniques.


Subject(s)
High-Throughput Nucleotide Sequencing/methods, Polymerase Chain Reaction/methods, Long Noncoding RNA/genetics, RNA Sequence Analysis/methods, Exons/genetics, Genetic Loci, Humans, Molecular Sequence Annotation, Organ Specificity/genetics, Proof of Concept Study, Protein Isoforms/genetics, Protein Isoforms/metabolism, RNA Splice Sites/genetics, Long Noncoding RNA/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism, Transcriptome/genetics
11.
Mol Syst Biol ; 12(5): 868, 2016 05 13.
Article in English | MEDLINE | ID: mdl-27178967

ABSTRACT

Precise regulation of mRNA decay is fundamental for robust yet not exaggerated inflammatory responses to pathogens. However, a global model integrating regulation and functional consequences of inflammation-associated mRNA decay remains to be established. Using time-resolved high-resolution RNA binding analysis of the mRNA-destabilizing protein tristetraprolin (TTP), an inflammation-limiting factor, we qualitatively and quantitatively characterize TTP binding positions in the transcriptome of immunostimulated macrophages. We identify pervasive destabilizing and non-destabilizing TTP binding, including a robust intronic binding, showing that TTP binding is not sufficient for mRNA destabilization. A low degree of flanking RNA structuredness distinguishes occupied from silent binding motifs. By functionally relating TTP binding sites to mRNA stability and levels, we identify a TTP-controlled switch for the transition from inflammatory into the resolution phase of the macrophage immune response. Mapping of binding positions of the mRNA-stabilizing protein HuR reveals little target and functional overlap with TTP, implying a limited co-regulation of inflammatory mRNA decay by these proteins. Our study establishes a functionally annotated and navigable transcriptome-wide atlas (http://ttp-atlas.univie.ac.at) of cis-acting elements controlling mRNA decay in inflammation.


Subject(s)
Lipopolysaccharides/pharmacology, Macrophages/immunology, Messenger RNA/chemistry, Tristetraprolin/metabolism, Animals, Binding Sites, Cultured Cells, Gene Expression Profiling/methods, Gene Expression Regulation, HEK293 Cells, Humans, Macrophages/drug effects, Mice, RNA Stability, Messenger RNA/metabolism, RNA Sequence Analysis
12.
Methods ; 103: 86-98, 2016 07 01.
Article in English | MEDLINE | ID: mdl-27064083

ABSTRACT

RNA secondary structures have proven essential for understanding the regulatory functions performed by RNA such as microRNAs, bacterial small RNAs, or riboswitches. This success is in part due to the availability of efficient computational methods for predicting RNA secondary structures. Recent advances focus on dealing with the inherent uncertainty of prediction by considering the ensemble of possible structures rather than the single most stable one. Moreover, the advent of high-throughput structural probing has spurred the development of computational methods that incorporate such experimental data as auxiliary information.


Subject(s)
RNA/chemistry, Algorithms, Base Sequence, Computational Biology, Computer Simulation, Humans, Molecular Models, RNA Folding, RNA Sequence Analysis
13.
Nucleic Acids Res ; 44(D1): D90-5, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26602692

ABSTRACT

AREsite2 represents an update for AREsite, an on-line resource for the investigation of AU-rich elements (ARE) in human and mouse mRNA 3'UTR sequences. The new updated and enhanced version allows detailed investigation of AU, GU and U-rich elements (ARE, GRE, URE) in the transcriptome of Homo sapiens, Mus musculus, Danio rerio, Caenorhabditis elegans and Drosophila melanogaster. It contains information on genomic location, genic context, RNA secondary structure context and conservation of annotated motifs. Improvements include annotation of motifs not only in 3'UTRs but in the whole gene body including introns, additional genomes, and locally stable secondary structures from genome wide scans. Furthermore, we include data from CLIP-Seq experiments in order to highlight motifs with validated protein interaction. Additionally, we provide a REST interface for experienced users to interact with the database in a semi-automated manner. The database is publicly available at: http://rna.tbi.univie.ac.at/AREsite.


Subject(s)
3' Untranslated Regions, Nucleic Acid Databases, RNA/chemistry, Animals, Genomics, Humans, Mice, Molecular Sequence Annotation, Nucleic Acid Conformation, Nucleotide Motifs
14.
Nat Commun ; 6: 5903, 2015 Jan 13.
Article in English | MEDLINE | ID: mdl-25582907

ABSTRACT

Mice have been a long-standing model for human biology and disease. Here we characterize, by RNA sequencing, the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles in human cell lines reveals substantial conservation of transcriptional programmes, and uncovers a distinct class of genes with levels of expression that have been constrained early in vertebrate evolution. This core set of genes captures a substantial fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with conserved epigenetic marking, as well as with a characteristic post-transcriptional regulatory programme, in which sub-cellular localization and alternative splicing play comparatively large roles.


Subject(s)
Molecular Evolution, Gene Expression Regulation, Transcriptome, Alternative Splicing, Animals, Biological Evolution, Cell Line, Genetic Epigenesis, Gene Expression Profiling, Gene Library, Genome, Histones/chemistry, Humans, Mice, Inbred C57BL Mice, Genetic Models, Antisense Oligonucleotides, Phenotype, RNA Sequence Analysis
15.
Nature ; 515(7527): 355-64, 2014 Nov 20.
Article in English | MEDLINE | ID: mdl-25409824

ABSTRACT

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.


Subject(s)
Genome/genetics, Genomics, Mice/genetics, Molecular Sequence Annotation, Animals, Cell Lineage/genetics, Chromatin/genetics, Chromatin/metabolism, Conserved Sequence/genetics, DNA Replication/genetics, Deoxyribonuclease I/metabolism, Gene Expression Regulation/genetics, Gene Regulatory Networks/genetics, Genome-Wide Association Study, Humans, RNA/genetics, Nucleic Acid Regulatory Sequences/genetics, Species Specificity, Transcription Factors/metabolism, Transcriptome/genetics
16.
Genome Biol ; 15(2): R34, 2014 Feb 10.
Article in English | MEDLINE | ID: mdl-24512684

ABSTRACT

Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from single-end cDNA sequences. In contrast to other methods, our approach accommodates multi-junction structures. Our method compares favorably with competing tools for conventionally spliced mRNAs and, with a gain of up to 40% of recall, systematically outperforms them on reads with multiple splits, trans-splicing and circular products. The algorithm is integrated into our mapping tool segemehl (http://www.bioinf.uni-leipzig.de/Software/segemehl/).


Subject(s)
Algorithms, RNA Splicing/genetics, RNA/genetics, Trans-Splicing/genetics, Complementary DNA/genetics, High-Throughput Nucleotide Sequencing, Circular RNA, Messenger RNA/metabolism, Software
17.
Article in English | MEDLINE | ID: mdl-24334379

ABSTRACT

G-quadruplexes are abundant locally stable structural elements in nucleic acids. The combinatorial theory of RNA structures and the dynamic programming algorithms for RNA secondary structure prediction are extended here to incorporate G-quadruplexes using a simple but plausible energy model. With preliminary energy parameters, we find that the overwhelming majority of putative quadruplex-forming sequences in the human genome are likely to fold into canonical secondary structures instead. Stable G-quadruplexes are strongly enriched, however, in the 5'UTR of protein coding mRNAs.


Subject(s)
G-Quadruplexes, Nucleic Acid Conformation, Messenger RNA/chemistry, 5' Untranslated Regions, Base Sequence, Computational Biology, Humans, RNA Folding, Messenger RNA/genetics, Messenger RNA/metabolism, Sequence Alignment, RNA Sequence Analysis, Thermodynamics
18.
Genome Res ; 22(9): 1760-74, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955987

ABSTRACT

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two-exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.


Subject(s)
Genetic Databases, Human Genome, Genomics/methods, Molecular Sequence Annotation, Animals, Computational Biology/methods, Complementary DNA/chemistry, Complementary DNA/genetics, Molecular Evolution, Exons, Genetic Loci, Humans, Internet, Molecular Models, Open Reading Frames, Pseudogenes, Quality Control, RNA Splice Sites, Long Noncoding RNA, Reproducibility of Results, Untranslated Regions
19.
Genome Res ; 22(9): 1775-89, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955988

ABSTRACT

The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses indicate that lncRNAs are generated through pathways similar to that of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths. In contrast to protein-coding genes, however, lncRNAs display a striking bias toward two-exon transcripts, they are predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs. They are under stronger selective pressure than neutrally evolving sequences, particularly in their promoter regions, which display levels of selection comparable to protein-coding genes. Importantly, about one-third seem to have arisen within the primate lineage. Comprehensive analysis of their expression in multiple human organs and brain regions shows that lncRNAs are generally expressed at lower levels than protein-coding genes, and display more tissue-specific expression patterns, with a large fraction of tissue-specific lncRNAs expressed in the brain. Expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes. This GENCODE annotation represents a valuable resource for future studies of lncRNAs.


Subject(s)
Genetic Databases, Long Noncoding RNA/genetics, Alternative Splicing, Animals, Cell Nucleus/genetics, Cell Nucleus/metabolism, Cluster Analysis, Molecular Evolution, Exons, Gene Expression Profiling, Gene Expression Regulation, Histones/metabolism, Humans, Molecular Sequence Annotation, Open Reading Frames, Organ Specificity/genetics, Primates/genetics, Post-Transcriptional RNA Processing, RNA Splice Sites, Messenger RNA/genetics, Genetic Selection, Genetic Transcription
20.
Nature ; 489(7414): 101-8, 2012 Sep 06.
Article in English | MEDLINE | ID: mdl-22955620

ABSTRACT

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.


Subject(s)
DNA/genetics, Encyclopedias as Topic, Human Genome/genetics, Molecular Sequence Annotation, Nucleic Acid Regulatory Sequences/genetics, Genetic Transcription/genetics, Transcriptome/genetics, Alleles, Cell Line, Intergenic DNA/genetics, Genetic Enhancer Elements, Exons/genetics, Gene Expression Profiling, Genes/genetics, Genomics, Humans, Polyadenylation/genetics, Protein Isoforms/genetics, RNA/biosynthesis, RNA/genetics, RNA Editing/genetics, RNA Splicing/genetics, Nucleic Acid Repetitive Sequences/genetics, RNA Sequence Analysis
...