ABSTRACT
Known protein-coding gene exons compose less than 3% of the human genome. The remaining 97% is largely uncharted territory, with only a small fraction characterized. The recent observation of transcription in this intergenic territory has stimulated debate about the extent of intergenic transcription and whether these intergenic RNAs are functional. Here, using a large set of RNA-seq data covering a wide array of human tissue types, we directly observed that the majority of the genome is indeed transcribed, corroborating recent observations by the ENCODE project. Furthermore, using de novo transcriptome assembly of these RNA-seq data, we found that intergenic regions encode far more long intergenic noncoding RNAs (lincRNAs) than previously described, helping to resolve the discrepancy between the vast amount of observed intergenic transcription and the limited number of previously known lincRNAs. In total, we identified tens of thousands of putative lincRNAs expressed at a minimum of one copy per cell, significantly expanding upon prior lincRNA annotation sets. These lincRNAs are specifically regulated and conserved rather than being the product of transcriptional noise. In addition, lincRNAs are strongly enriched for trait-associated SNPs, suggesting a new mechanism by which intergenic trait-associated regions may function. These findings will enable the discovery and interrogation of novel intergenic functional elements.
Subject(s)
Intergenic DNA/genetics , Long Noncoding RNA/genetics , Genetic Transcription , Intergenic DNA/isolation & purification , Exons , Human Genome , High-Throughput Nucleotide Sequencing , Humans , Single Nucleotide Polymorphism , Long Noncoding RNA/isolation & purification
ABSTRACT
The pancreatic β-cell is critical for the maintenance of glycemic control. Knowing the compendium of genes expressed in β-cells will further our understanding of this critical cell type and may allow the identification of future antidiabetes drug targets. Here, we report the use of next-generation sequencing to obtain nearly 1 billion reads from the polyadenylated RNA of islets and purified β-cells from mice. These data reveal novel examples of β-cell-specific splicing events, promoter usage, and over 1000 long intergenic noncoding RNAs expressed in mouse β-cells. Many of these long intergenic noncoding RNAs are β-cell specific, and we hypothesize that this large set of novel RNAs may play important roles in β-cell function. Our data demonstrate unique features of the β-cell transcriptome.