1.
Nucleic Acids Res ; 51(9): 4191-4207, 2023 05 22.
Article in English | MEDLINE | ID: mdl-37026479

ABSTRACT

Adenosine deaminase acting on RNA 1 (ADAR1) promotes A-to-I conversion in double-stranded and structured RNAs. ADAR1 has two isoforms transcribed from different promoters: cytoplasmic ADAR1p150 is interferon-inducible while ADAR1p110 is constitutively expressed and primarily localized in the nucleus. Mutations in ADAR1 cause Aicardi-Goutières syndrome (AGS), a severe autoinflammatory disease associated with aberrant IFN production. In mice, deletion of ADAR1 or the p150 isoform leads to embryonic lethality driven by overexpression of interferon-stimulated genes. This phenotype is rescued by deletion of the cytoplasmic dsRNA-sensor MDA5, indicating that the p150 isoform is indispensable and cannot be rescued by ADAR1p110. Nevertheless, editing sites uniquely targeted by ADAR1p150 remain elusive. Here, by transfection of ADAR1 isoforms into ADAR-less mouse cells we detect isoform-specific editing patterns. Using mutated ADAR variants, we test how intracellular localization and the presence of a Z-DNA binding domain-α (ZBDα) affect editing preferences. These data show that ZBDα only minimally contributes to p150 editing specificity while isoform-specific editing is primarily directed by the intracellular localization of ADAR1 isoforms. Our study is complemented by RIP-seq on human cells ectopically expressing tagged ADAR1 isoforms. Both datasets reveal enrichment of intronic editing and binding by ADAR1p110 while ADAR1p150 preferentially binds and edits 3'UTRs.


Subjects
Adenosine Deaminase; Interferons; RNA Editing; RNA, Double-Stranded; Animals; Humans; Mice; Adenosine Deaminase/genetics; Adenosine Deaminase/metabolism; Cell Nucleus/metabolism; Cytoplasm/metabolism; Interferons/genetics; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA, Double-Stranded/genetics
2.
PLoS Genet ; 18(8): e1010376, 2022 08.
Article in English | MEDLINE | ID: mdl-35994477

ABSTRACT

The class I histone deacetylases are essential regulators of cell fate decisions in health and disease. While pan- and class-specific HDAC inhibitors are available, these drugs do not allow a comprehensive understanding of individual HDAC function, or the therapeutic potential of isoform-specific targeting. To systematically compare the impact of individual catalytic functions of HDAC1, HDAC2 and HDAC3, we generated human HAP1 cell lines expressing catalytically inactive HDAC enzymes. Using this genetic toolbox we compare the effect of individual HDAC inhibition with the effects of class I specific inhibitors on cell viability, protein acetylation and gene expression. Individual inactivation of HDAC1 or HDAC2 has only mild effects on cell viability, while HDAC3 inactivation or loss results in DNA damage and apoptosis. Inactivation of HDAC1/HDAC2 led to increased acetylation of components of the COREST co-repressor complex, reduced deacetylase activity associated with this complex and derepression of neuronal genes. HDAC3 controls the acetylation of nuclear hormone receptor associated proteins and the expression of nuclear hormone receptor regulated genes. Acetylation of specific histone acetyltransferases and HDACs is sensitive to inactivation of HDAC1/HDAC2. Over a wide range of assays, we determined that in particular HDAC1 or HDAC2 catalytic inactivation mimics class I specific HDAC inhibitors. Importantly, we further demonstrate that catalytic inactivation of HDAC1 or HDAC2 sensitizes cells to specific cancer drugs. In summary, our systematic study revealed isoform-specific roles of HDAC1/2/3 catalytic functions. We suggest that targeted genetic inactivation of particular isoforms effectively mimics pharmacological HDAC inhibition allowing the identification of relevant HDACs as targets for therapeutic intervention.


Subjects
Histone Deacetylase 1; Histone Deacetylase Inhibitors; Acetylation; Histone Deacetylase 1/genetics; Histone Deacetylase 1/metabolism; Histone Deacetylase 2/genetics; Histone Deacetylase 2/metabolism; Histone Deacetylase Inhibitors/pharmacology; Histone Deacetylases/genetics; Histone Deacetylases/metabolism; Humans; Protein Isoforms/genetics; Protein Isoforms/metabolism
3.
Bioinformatics ; 37(15): 2126-2133, 2021 Aug 09.
Article in English | MEDLINE | ID: mdl-33538792

ABSTRACT

MOTIVATION: Predicting the folding dynamics of RNAs is a computationally difficult problem, first and foremost due to the combinatorial explosion of alternative structures in the folding space. Abstractions are therefore needed to simplify downstream analyses, and thus make them computationally tractable. This can be achieved by various structure sampling algorithms. However, current sampling methods are still time-consuming and frequently fail to represent key elements of the folding space. METHOD: We introduce RNAxplorer, a novel adaptive sampling method to efficiently explore the structure space of RNAs. RNAxplorer uses dynamic programming to perform an efficient Boltzmann sampling in the presence of guiding potentials, which are accumulated into pseudo-energy terms and reflect similarity to already well-sampled structures. This way, we effectively steer sampling toward underrepresented or unexplored regions of the structure space. RESULTS: We developed and applied different measures to benchmark our sampling method against its competitors. Most of the measures show that RNAxplorer produces more diverse structure samples, yields rare conformations that may be inaccessible to other sampling methods and is better at finding the most relevant kinetic traps in the landscape. Thus, it produces a more representative coarse graining of the landscape, which is well suited to subsequently compute better approximations of RNA folding kinetics. AVAILABILITY AND IMPLEMENTATION: https://github.com/ViennaRNA/RNAxplorer/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
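As a rough illustration of the guiding-potential idea in the RNAxplorer abstract above, the toy sketch below reweights a Boltzmann distribution over a handful of states with a pseudo-energy penalty on states that have already been sampled. It is not RNAxplorer's dynamic-programming implementation: the state names, energies, penalty step and the function names `boltzmann_probs` and `guided_sample` are all hypothetical, and a real RNA structure ensemble is far too large to enumerate this way.

```python
import math
import random

KT = 0.616  # approximate RT in kcal/mol at 37 degrees C

def boltzmann_probs(energies, penalties=None, kt=KT):
    """Return normalized Boltzmann probabilities for each state.

    energies:  dict mapping state -> free energy (kcal/mol)
    penalties: dict mapping state -> pseudo-energy added on top
               (a stand-in for a guiding potential; hypothetical)
    """
    penalties = penalties or {}
    weights = {s: math.exp(-(e + penalties.get(s, 0.0)) / kt)
               for s, e in energies.items()}
    z = sum(weights.values())  # partition function of the biased ensemble
    return {s: w / z for s, w in weights.items()}

def guided_sample(energies, n, penalty_step=1.0, seed=0):
    """Draw n states, penalizing each state a little more every time it is drawn."""
    rng = random.Random(seed)
    penalties = {s: 0.0 for s in energies}
    samples = []
    for _ in range(n):
        probs = boltzmann_probs(energies, penalties)
        states = list(probs)
        s = rng.choices(states, weights=[probs[x] for x in states])[0]
        samples.append(s)
        penalties[s] += penalty_step  # steer later draws away from s
    return samples
```

Because the penalty only grows for states that were drawn, later draws are pushed toward states that plain Boltzmann sampling would rarely reach, which is the qualitative effect the abstract describes.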

4.
Plant Physiol ; 180(1): 305-322, 2019 05.
Article in English | MEDLINE | ID: mdl-30760640

ABSTRACT

Cis-Natural Antisense Transcripts (cis-NATs), which overlap protein coding genes and are transcribed from the opposite DNA strand, constitute an important group of noncoding RNAs. Whereas several examples of cis-NATs regulating the expression of their cognate sense gene are known, most cis-NATs function by altering the steady-state level or structure of mRNA via changes in transcription, mRNA stability, or splicing, and very few cases involve the regulation of sense mRNA translation. This study was designed to systematically search for cis-NATs influencing cognate sense mRNA translation in Arabidopsis (Arabidopsis thaliana). Establishment of a pipeline relying on sequencing of total polyA+ and polysomal RNA from Arabidopsis grown under various conditions (i.e. nutrient deprivation and phytohormone treatments) allowed the identification of 14 cis-NATs whose expression correlated either positively or negatively with cognate sense mRNA translation. With use of a combination of cis-NAT stable over-expression in transgenic plants and transient expression in protoplasts, the impact of cis-NAT expression on mRNA translation was confirmed for 4 out of 5 tested cis-NAT:sense mRNA pairs. These results expand the number of cis-NATs known to regulate cognate sense mRNA translation and provide a foundation for future studies of their mode of action. Moreover, this study highlights the role of this class of noncoding RNAs in translation regulation.


Subjects
Arabidopsis/genetics; Protein Biosynthesis; RNA, Antisense/genetics; Arabidopsis Proteins/genetics; DNA-Binding Proteins/genetics; Gene Expression Regulation, Plant; Plants, Genetically Modified; RNA, Messenger/genetics; RNA, Plant; Reproducibility of Results; Sequence Analysis, RNA; Transcription Factors/genetics
5.
Methods ; 156: 32-39, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30385321

ABSTRACT

Chemical modifications of RNA nucleotides change their identity and characteristics and thus alter genetic and structural information encoded in the genomic DNA. tRNA and rRNA are probably the most heavily modified genes, and often depend on derivatization or isomerization of their nucleobases in order to correctly fold into their functional structures. Recent RNomics studies, however, report transcriptome-wide RNA modification and suggest a more general regulation of structuredness of RNAs by this so-called epitranscriptome. Modification seems to require specific substrate structures, which in turn are stabilized or destabilized and thus promote or inhibit refolding events of regulatory RNA structures. In this review, we revisit RNA modifications and the related structures from a computational point of view. We discuss known substrate structures, their properties such as sub-motifs as well as consequences of modifications on base pairing patterns and possible refolding events. Given that efficient RNA structure prediction methods for canonical base pairs were established several decades ago, we review to what extent these methods allow the inclusion of modified nucleotides to model and study epitranscriptomic effects on RNA structures.


Subjects
Adenosine/metabolism; Inosine/metabolism; RNA Processing, Post-Transcriptional; Sequence Analysis, RNA/methods; Transcriptome; Animals; Base Pairing; Base Sequence; Humans; Methylation; MicroRNAs/genetics; MicroRNAs/metabolism; Nucleic Acid Conformation; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Ribosomal/genetics; RNA, Ribosomal/metabolism; RNA, Small Nuclear/genetics; RNA, Small Nuclear/metabolism; RNA, Transfer/genetics; RNA, Transfer/metabolism
6.
Genes (Basel) ; 9(8)2018 Aug 01.
Article in English | MEDLINE | ID: mdl-30071678

ABSTRACT

In this work, we present a computational screen conducted for functional RNA structures, resulting in over 100,000 conserved RNA structure elements found in alignments of mouse (mm10) against 59 other vertebrates. We explicitly included masked repeat regions to explore the potential of transposable elements and low-complexity regions to give rise to regulatory RNA elements. In our analysis pipeline, we implemented a four-step procedure: (i) we screened genome-wide alignments for potential structure elements using RNAz-2, (ii) realigned and refined candidate loci with LocARNA-P, (iii) scored candidates again with RNAz-2 in structure alignment mode, and (iv) searched for additional homologous loci in the mouse genome that were not covered by genome alignments. The 3'-untranslated regions (3'-UTRs) of protein-coding genes and small noncoding RNAs are enriched for structures, while coding sequences are depleted. Repeat-associated loci make up about 95% of the homologous loci identified and are, as expected, predominantly found in intronic and intergenic regions. Nevertheless, we report structure elements enriched in specific genome elements, such as 3'-UTRs and long noncoding RNAs (lncRNAs). We provide full access to our results via a custom UCSC genome browser trackhub freely available on our website (http://rna.tbi.univie.ac.at/trackhubs/#RNAz).

8.
Genome Biol ; 17(1): 220, 2016 10 25.
Article in English | MEDLINE | ID: mdl-27782844

ABSTRACT

BACKGROUND: Short interspersed elements (SINEs) represent the most abundant group of non-long-terminal repeat transposable elements in mammalian genomes. In primates, Alu elements are the most prominent and homogenous representatives of SINEs. Due to their frequent insertion within or close to coding regions, SINEs have been suggested to play a crucial role during genome evolution. Moreover, Alu elements within mRNAs have also been reported to control gene expression at different levels. RESULTS: Here, we undertake a genome-wide analysis of insertion patterns of human Alus within transcribed portions of the genome. Multiple, nearby insertions of SINEs within one transcript are more abundant in tandem orientation than in inverted orientation. Indeed, analysis of transcriptome-wide expression levels of 15 ENCODE cell lines suggests a cis-repressive effect of inverted Alu elements on gene expression. Using reporter assays, we show that the negative effect of inverted SINEs on gene expression is independent of known sensors of double-stranded RNAs. Instead, transcriptional elongation seems impaired, leading to reduced mRNA levels. CONCLUSIONS: Our study suggests that there is a bias against multiple SINE insertions that can promote intramolecular base pairing within a transcript. Moreover, at a genome-wide level, mRNAs harboring inverted SINEs are less expressed than mRNAs harboring single or tandemly arranged SINEs. Finally, we demonstrate a novel mechanism by which inverted SINEs can impact on gene expression by interfering with RNA polymerase II.


Subjects
RNA Polymerase II/genetics; Short Interspersed Nucleotide Elements/genetics; Transcription, Genetic; Transcriptome/genetics; Alu Elements/genetics; Cell Line; Evolution, Molecular; Gene Expression Regulation; Genome, Human; Humans; RNA, Double-Stranded/genetics; RNA, Messenger/genetics
9.
Sci Rep ; 6: 34589, 2016 10 07.
Article in English | MEDLINE | ID: mdl-27713552

ABSTRACT

The unprecedented outbreak of Ebola in West Africa resulted in over 28,000 cases and 11,000 deaths, underlining the need for a better understanding of the biology of this highly pathogenic virus to develop specific counter strategies. Two filoviruses, the Ebola and Marburg viruses, result in a severe and often fatal infection in humans. However, bats are natural hosts and survive filovirus infections without obvious symptoms. The molecular basis of this striking difference in the response to filovirus infections is not well understood. We report a systematic overview of differentially expressed genes, activity motifs and pathways in human and bat cells infected with the Ebola and Marburg viruses, and we demonstrate that the replication of filoviruses is more rapid in human cells than in bat cells. We also found that the most strongly regulated genes upon filovirus infection are chemokine ligands and transcription factors. We observed a strong induction of the JAK/STAT pathway, of several genes encoding inhibitors of MAP kinases (DUSP genes) and of PPP1R15A, which is involved in ER stress-induced cell death. We used comparative transcriptomics to provide a data resource that can be used to identify cellular responses that might allow bats to survive filovirus infections.


Subjects
Ebolavirus/metabolism; Gene Expression Regulation; Hemorrhagic Fever, Ebola/metabolism; Marburg Virus Disease/metabolism; Marburgvirus/metabolism; Signal Transduction; Transcription, Genetic; Animals; Cell Line, Tumor; Chiroptera; Humans
10.
Nat Commun ; 7: 12339, 2016 08 17.
Article in English | MEDLINE | ID: mdl-27531712

ABSTRACT

Long non-coding RNAs (lncRNAs) constitute a large, yet mostly uncharacterized fraction of the mammalian transcriptome. Such characterization requires a comprehensive, high-quality annotation of their gene structure and boundaries, which is currently lacking. Here we describe RACE-Seq, an experimental workflow designed to address this based on RACE (rapid amplification of cDNA ends) and long-read RNA sequencing. We apply RACE-Seq to 398 human lncRNA genes in seven tissues, leading to the discovery of 2,556 on-target, novel transcripts. About 60% of the targeted loci are extended in either 5' or 3', often reaching genomic hallmarks of gene boundaries. Analysis of the novel transcripts suggests that lncRNAs are as long, have as many exons and undergo as much alternative splicing as protein-coding genes, contrary to current assumptions. Overall, we show that RACE-Seq is an effective tool to annotate an organism's deep transcriptome, and compares favourably to other targeted sequencing techniques.


Subjects
High-Throughput Nucleotide Sequencing/methods; Polymerase Chain Reaction/methods; RNA, Long Noncoding/genetics; Sequence Analysis, RNA/methods; Exons/genetics; Genetic Loci; Humans; Molecular Sequence Annotation; Organ Specificity/genetics; Proof of Concept Study; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA Splice Sites/genetics; RNA, Long Noncoding/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; Transcriptome/genetics
11.
Mol Syst Biol ; 12(5): 868, 2016 05 13.
Article in English | MEDLINE | ID: mdl-27178967

ABSTRACT

Precise regulation of mRNA decay is fundamental for robust yet not exaggerated inflammatory responses to pathogens. However, a global model integrating regulation and functional consequences of inflammation-associated mRNA decay remains to be established. Using time-resolved high-resolution RNA binding analysis of the mRNA-destabilizing protein tristetraprolin (TTP), an inflammation-limiting factor, we qualitatively and quantitatively characterize TTP binding positions in the transcriptome of immunostimulated macrophages. We identify pervasive destabilizing and non-destabilizing TTP binding, including a robust intronic binding, showing that TTP binding is not sufficient for mRNA destabilization. A low degree of flanking RNA structuredness distinguishes occupied from silent binding motifs. By functionally relating TTP binding sites to mRNA stability and levels, we identify a TTP-controlled switch for the transition from inflammatory into the resolution phase of the macrophage immune response. Mapping of binding positions of the mRNA-stabilizing protein HuR reveals little target and functional overlap with TTP, implying a limited co-regulation of inflammatory mRNA decay by these proteins. Our study establishes a functionally annotated and navigable transcriptome-wide atlas (http://ttp-atlas.univie.ac.at) of cis-acting elements controlling mRNA decay in inflammation.


Subjects
Lipopolysaccharides/pharmacology; Macrophages/immunology; RNA, Messenger/chemistry; Tristetraprolin/metabolism; Animals; Binding Sites; Cells, Cultured; Gene Expression Profiling/methods; Gene Expression Regulation; HEK293 Cells; Humans; Macrophages/drug effects; Mice; RNA Stability; RNA, Messenger/metabolism; Sequence Analysis, RNA
12.
Methods ; 103: 86-98, 2016 07 01.
Article in English | MEDLINE | ID: mdl-27064083

ABSTRACT

RNA secondary structures have proven essential for understanding the regulatory functions performed by RNA such as microRNAs, bacterial small RNAs, or riboswitches. This success is in part due to the availability of efficient computational methods for predicting RNA secondary structures. Recent advances focus on dealing with the inherent uncertainty of prediction by considering the ensemble of possible structures rather than the single most stable one. Moreover, the advent of high-throughput structural probing has spurred the development of computational methods that incorporate such experimental data as auxiliary information.


Subjects
RNA/chemistry; Algorithms; Base Sequence; Computational Biology; Computer Simulation; Humans; Models, Molecular; RNA Folding; Sequence Analysis, RNA
13.
Nucleic Acids Res ; 44(D1): D90-5, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26602692

ABSTRACT

AREsite2 represents an update for AREsite, an on-line resource for the investigation of AU-rich elements (ARE) in human and mouse mRNA 3'UTR sequences. The new updated and enhanced version allows detailed investigation of AU, GU and U-rich elements (ARE, GRE, URE) in the transcriptome of Homo sapiens, Mus musculus, Danio rerio, Caenorhabditis elegans and Drosophila melanogaster. It contains information on genomic location, genic context, RNA secondary structure context and conservation of annotated motifs. Improvements include annotation of motifs not only in 3'UTRs but in the whole gene body including introns, additional genomes, and locally stable secondary structures from genome wide scans. Furthermore, we include data from CLIP-Seq experiments in order to highlight motifs with validated protein interaction. Additionally, we provide a REST interface for experienced users to interact with the database in a semi-automated manner. The database is publicly available at: http://rna.tbi.univie.ac.at/AREsite.


Subjects
3' Untranslated Regions; Databases, Nucleic Acid; RNA/chemistry; Animals; Genomics; Humans; Mice; Molecular Sequence Annotation; Nucleic Acid Conformation; Nucleotide Motifs
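To make the notion of an AU-rich element in the AREsite2 entry above concrete, here is a minimal sketch that locates the core AUUUA pentamer in a sequence. It runs entirely locally and does not use the AREsite2 database or its REST interface; the function name `find_are_pentamers` and the example sequence are made up for illustration, and the longer ARE, GRE and URE motif classes mentioned in the abstract are not covered.

```python
import re

# Core AU-rich element pentamer. The zero-width lookahead lets overlapping
# occurrences (e.g. AUUUAUUUA) be reported individually.
ARE_PENTAMER = re.compile(r"(?=(AUUUA))")

def find_are_pentamers(rna):
    """Return 0-based start positions of every AUUUA in an RNA sequence."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [m.start() for m in ARE_PENTAMER.finditer(rna)]

utr = "GGAUUUAUUUACC"  # made-up 3'UTR fragment
print(find_are_pentamers(utr))  # [2, 6]
```

The two hits share the A at position 6, which a plain non-overlapping search would report only once.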
14.
Nat Commun ; 6: 5903, 2015 Jan 13.
Article in English | MEDLINE | ID: mdl-25582907

ABSTRACT

Mice have been a long-standing model for human biology and disease. Here we characterize, by RNA sequencing, the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles in human cell lines reveals substantial conservation of transcriptional programmes, and uncovers a distinct class of genes with levels of expression that have been constrained early in vertebrate evolution. This core set of genes captures a substantial fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with conserved epigenetic marking, as well as with a characteristic post-transcriptional regulatory programme, in which sub-cellular localization and alternative splicing play comparatively large roles.


Subjects
Evolution, Molecular; Gene Expression Regulation; Transcriptome; Alternative Splicing; Animals; Biological Evolution; Cell Line; Epigenesis, Genetic; Gene Expression Profiling; Gene Library; Genome; Histones/chemistry; Humans; Mice; Mice, Inbred C57BL; Models, Genetic; Oligonucleotides, Antisense; Phenotype; Sequence Analysis, RNA
15.
Nature ; 515(7527): 355-64, 2014 Nov 20.
Article in English | MEDLINE | ID: mdl-25409824

ABSTRACT

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.


Subjects
Genome/genetics; Genomics; Mice/genetics; Molecular Sequence Annotation; Animals; Cell Lineage/genetics; Chromatin/genetics; Chromatin/metabolism; Conserved Sequence/genetics; DNA Replication/genetics; Deoxyribonuclease I/metabolism; Gene Expression Regulation/genetics; Gene Regulatory Networks/genetics; Genome-Wide Association Study; Humans; RNA/genetics; Regulatory Sequences, Nucleic Acid/genetics; Species Specificity; Transcription Factors/metabolism; Transcriptome/genetics
16.
Genome Biol ; 15(2): R34, 2014 Feb 10.
Article in English | MEDLINE | ID: mdl-24512684

ABSTRACT

Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from single-end cDNA sequences. In contrast to other methods, our approach accommodates multi-junction structures. Our method compares favorably with competing tools for conventionally spliced mRNAs and, with a gain of up to 40% of recall, systematically outperforms them on reads with multiple splits, trans-splicing and circular products. The algorithm is integrated into our mapping tool segemehl (http://www.bioinf.uni-leipzig.de/Software/segemehl/).


Subjects
Algorithms; RNA Splicing/genetics; RNA/genetics; Trans-Splicing/genetics; DNA, Complementary/genetics; High-Throughput Nucleotide Sequencing; RNA, Circular; RNA, Messenger/metabolism; Software
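The distinction drawn in the segemehl entry above between conventional, circular and trans-spliced junctions can be illustrated with a small coordinate check on a read that maps in two pieces. This is only a sketch under simplifying assumptions, not segemehl's algorithm: the `Fragment` type, the `classify_split` function and the category labels are hypothetical, the read is assumed to be sense to the transcript, and distant same-strand junctions that a real tool might call trans-spliced are not distinguished.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fragment:
    chrom: str
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive

def classify_split(first, second):
    """Classify a read split into two mapped fragments, given in read order."""
    if first.chrom != second.chrom or first.strand != second.strand:
        return "trans/fusion candidate"
    if first.strand == "+":
        left, right = first, second
    else:
        # On the minus strand, read order runs against genome coordinates.
        left, right = second, first
    if left.end <= right.start:
        return "collinear splice"  # fragments appear in the expected order
    if right.end <= left.start:
        return "back-splice (circular candidate)"  # order is inverted
    return "overlapping/ambiguous"
```

For example, `classify_split(Fragment("chr1", "+", 500, 550), Fragment("chr1", "+", 100, 150))` falls into the back-splice category because the second half of the read maps upstream of the first.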
17.
Article in English | MEDLINE | ID: mdl-24334379

ABSTRACT

G-quadruplexes are abundant locally stable structural elements in nucleic acids. The combinatorial theory of RNA structures and the dynamic programming algorithms for RNA secondary structure prediction are extended here to incorporate G-quadruplexes using a simple but plausible energy model. With preliminary energy parameters, we find that the overwhelming majority of putative quadruplex-forming sequences in the human genome are likely to fold into canonical secondary structures instead. Stable G-quadruplexes are strongly enriched, however, in the 5'UTR of protein coding mRNAs.


Subjects
G-Quadruplexes; Nucleic Acid Conformation; RNA, Messenger/chemistry; 5' Untranslated Regions; Base Sequence; Computational Biology; Humans; RNA Folding; RNA, Messenger/genetics; RNA, Messenger/metabolism; Sequence Alignment; Sequence Analysis, RNA; Thermodynamics
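For readers unfamiliar with "putative quadruplex-forming sequences" in the G-quadruplex entry above, the sketch below scans for one widely used sequence pattern: four runs of at least three Gs separated by loops of 1 to 7 nucleotides. This pattern is an assumption chosen for illustration; the abstract does not state which motif definition or energy parameters the authors used, and a regex hit says nothing about whether the quadruplex is thermodynamically favoured over a canonical secondary structure. The function name `find_pqs` and the test sequence are hypothetical.

```python
import re

# Four G-runs (>= 3 Gs each) separated by three loops of 1-7 nucleotides.
PQS = re.compile(r"(?:G{3,}[ACGU]{1,7}){3}G{3,}")

def find_pqs(rna):
    """Return (start, end, sequence) for non-overlapping putative G4 motifs."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.end(), m.group()) for m in PQS.finditer(rna)]

print(find_pqs("AAGGGAGGGCGGGUGGGAA"))  # [(2, 17, 'GGGAGGGCGGGUGGG')]
```

Positions are 0-based with an exclusive end, so the hit above covers nucleotides 2 through 16.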
18.
Genome Biol ; 13(9): R51, 2012 Sep 26.
Article in English | MEDLINE | ID: mdl-22951037

ABSTRACT

BACKGROUND: Pseudogenes have long been considered as nonfunctional genomic sequences. However, recent evidence suggests that many of them might have some form of biological activity, and the possibility of functionality has increased interest in their accurate annotation and integration with functional genomics data. RESULTS: As part of the GENCODE annotation of the human genome, we present the first genome-wide pseudogene assignment for protein-coding genes, based on both large-scale manual annotation and in silico pipelines. A key aspect of this coupled approach is that it allows us to identify pseudogenes in an unbiased fashion as well as untangle complex events through manual evaluation. We integrate the pseudogene annotations with the extensive ENCODE functional genomics information. In particular, we determine the expression level, transcription-factor and RNA polymerase II binding, and chromatin marks associated with each pseudogene. Based on their distribution, we develop simple statistical models for each type of activity, which we validate with large-scale RT-PCR-Seq experiments. Finally, we compare our pseudogenes with conservation and variation data from primate alignments and the 1000 Genomes project, producing lists of pseudogenes potentially under selection. CONCLUSIONS: At one extreme, some pseudogenes possess conventional characteristics of functionality; these may represent genes that have recently died. On the other hand, we find interesting patterns of partial activity, which may suggest that dead genes are being resurrected as functioning non-coding RNAs. The activity data of each pseudogene are stored in an associated resource, psiDR, which will be useful for the initial identification of potentially functional pseudogenes.


Subjects
Genome, Human; Pseudogenes; Transcription, Genetic; Animals; Binding Sites; Chromatin/chemistry; Chromatin/genetics; Humans; Models, Genetic; Models, Statistical; Molecular Sequence Annotation; Phylogeny; Primates; RNA Polymerase II/metabolism; Regulatory Sequences, Nucleic Acid; Selection, Genetic; Sequence Analysis, DNA; Transcription Factors/metabolism
19.
Nature ; 489(7414): 101-8, 2012 Sep 06.
Article in English | MEDLINE | ID: mdl-22955620

ABSTRACT

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.


Subjects
DNA/genetics; Encyclopedias as Topic; Genome, Human/genetics; Molecular Sequence Annotation; Regulatory Sequences, Nucleic Acid/genetics; Transcription, Genetic/genetics; Transcriptome/genetics; Alleles; Cell Line; DNA, Intergenic/genetics; Enhancer Elements, Genetic; Exons/genetics; Gene Expression Profiling; Genes/genetics; Genomics; Humans; Polyadenylation/genetics; Protein Isoforms/genetics; RNA/biosynthesis; RNA/genetics; RNA Editing/genetics; RNA Splicing/genetics; Repetitive Sequences, Nucleic Acid/genetics; Sequence Analysis, RNA
20.
Genome Res ; 22(9): 1698-710, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955982

ABSTRACT

Within the ENCODE Consortium, GENCODE aimed to accurately annotate all protein-coding genes, pseudogenes, and noncoding transcribed loci in the human genome through manual curation and computational methods. Annotated transcript structures were assessed, and less well-supported loci were systematically, experimentally validated. Predicted exon-exon junctions were evaluated by RT-PCR amplification followed by highly multiplexed sequencing readout, a method we called RT-PCR-seq. Seventy-nine percent of all assessed junctions are confirmed by this evaluation procedure, demonstrating the high quality of the GENCODE gene set. RT-PCR-seq was also efficient for screening gene models predicted using the Human Body Map (HBM) RNA-seq data. We validated 73% of these predictions, thus confirming 1168 novel genes, mostly noncoding, which will further complement the GENCODE annotation. Our novel experimental validation pipeline is extremely sensitive, far more than unbiased transcriptome profiling through RNA sequencing, which is becoming the norm. For example, exon-exon junctions unique to GENCODE annotated transcripts are five times more likely to be corroborated with our targeted approach than with extensive large-scale human transcriptome profiling. Data sets such as the HBM and ENCODE RNA-seq data fail to sample low-expressed transcripts. Our RT-PCR-seq targeted approach also has the advantage of identifying novel exons of known genes, as we discovered unannotated exons in ~11% of assessed introns. We thus estimate that at least 18% of known loci have yet-unannotated exons. Our work demonstrates that the cataloging of all of the genic elements encoded in the human genome will necessitate a coordinated effort between unbiased and targeted approaches, like RNA-seq and RT-PCR-seq.


Subjects
Gene Expression Profiling/methods; Genome, Human; Transcriptome; Computational Biology/methods; Exons; High-Throughput Nucleotide Sequencing; Humans; Introns; Molecular Sequence Annotation; Open Reading Frames; RNA Isoforms; RNA, Messenger/chemistry; RNA, Messenger/genetics; Reproducibility of Results; Reverse Transcriptase Polymerase Chain Reaction; Sensitivity and Specificity