1.
Genome Res ; 30(7): 1047-1059, 2020 07.
Article En | MEDLINE | ID: mdl-32759341

We have produced RNA sequencing data for 53 primary cells from different locations in the human body. The clustering of these primary cells reveals that most cells in the human body share a few broad transcriptional programs, which define five major cell types: epithelial, endothelial, mesenchymal, neural, and blood cells. These act as basic components of many tissues and organs. Based on gene expression, these cell types redefine the basic histological types by which tissues have been traditionally classified. We identified genes whose expression is specific to these cell types, and from these genes, we estimated the contribution of the major cell types to the composition of human tissues. We found this cellular composition to be a characteristic signature of tissues and to reflect tissue morphological heterogeneity and histology. We identified changes in cellular composition in different tissues associated with age and sex, and found that departures from the normal cellular composition correlate with histological phenotypes associated with disease.


Transcription, Genetic , Cell Line , Endothelial Cells/metabolism , Epithelial Cells/metabolism , Female , Gene Expression Profiling , Gynecomastia/genetics , Gynecomastia/metabolism , Humans , Male , Mesoderm/cytology , Mesoderm/metabolism , Neoplasms/genetics , Organ Specificity , Sequence Analysis, RNA
2.
Genome Res ; 29(11): 1900-1909, 2019 11.
Article En | MEDLINE | ID: mdl-31645363

MicroRNAs (miRNAs) play a critical role as posttranscriptional regulators of gene expression. The ENCODE Project profiled the expression of miRNAs in an extensive set of organs during a time-course of mouse embryonic development and captured the expression dynamics of 785 miRNAs. We found distinct organ-specific and developmental stage-specific miRNA expression clusters, with an overall pattern of increasing organ-specific expression as embryonic development proceeds. Comparative analysis of conserved miRNAs in mouse and human revealed stronger clustering of expression patterns by organ type rather than by species. An analysis of messenger RNA expression clusters compared with miRNA expression clusters identifies the potential role of specific miRNA expression clusters in suppressing the expression of mRNAs specific to other developmental programs in the organ in which these miRNAs are expressed during embryonic development. Our results provide the most comprehensive time-course of miRNA expression as part of an integrated ENCODE reference data set for mouse embryonic development.


Embryonic Development/genetics , MicroRNAs/genetics , Animals , Female , Gene Expression Regulation, Developmental , Mice , Pregnancy , RNA, Messenger/genetics
3.
Nat Commun ; 6: 5903, 2015 Jan 13.
Article En | MEDLINE | ID: mdl-25582907

Mice have been a long-standing model for human biology and disease. Here we characterize, by RNA sequencing, the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles in human cell lines reveals substantial conservation of transcriptional programmes, and uncovers a distinct class of genes with levels of expression that have been constrained early in vertebrate evolution. This core set of genes captures a substantial fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with conserved epigenetic marking, as well as with a characteristic post-transcriptional regulatory programme, in which sub-cellular localization and alternative splicing play comparatively large roles.


Evolution, Molecular , Gene Expression Regulation , Transcriptome , Alternative Splicing , Animals , Biological Evolution , Cell Line , Epigenesis, Genetic , Gene Expression Profiling , Gene Library , Genome , Histones/chemistry , Humans , Mice , Mice, Inbred C57BL , Models, Genetic , Oligonucleotides, Antisense , Phenotype , Sequence Analysis, RNA
4.
Nature ; 515(7527): 355-64, 2014 Nov 20.
Article En | MEDLINE | ID: mdl-25409824

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.


Genome/genetics , Genomics , Mice/genetics , Molecular Sequence Annotation , Animals , Cell Lineage/genetics , Chromatin/genetics , Chromatin/metabolism , Conserved Sequence/genetics , DNA Replication/genetics , Deoxyribonuclease I/metabolism , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Genome-Wide Association Study , Humans , RNA/genetics , Regulatory Sequences, Nucleic Acid/genetics , Species Specificity , Transcription Factors/metabolism , Transcriptome/genetics
5.
Bioinformatics ; 29(1): 15-21, 2013 Jan 01.
Article En | MEDLINE | ID: mdl-23104886

MOTIVATION: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. RESULTS: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. AVAILABILITY AND IMPLEMENTATION: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under the GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.


Sequence Alignment/methods , Software , Algorithms , Cluster Analysis , Gene Expression Profiling , Genome, Human , Humans , RNA Splicing , Sequence Analysis, RNA/methods
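
The "sequential maximum mappable seed search in uncompressed suffix arrays" mentioned in the abstract above can be illustrated with a toy sketch. This is not STAR's implementation: the function names (`build_suffix_array`, `maximal_mappable_prefix`, `seed_search`) are hypothetical, the suffix array is built naively, mismatches are handled by simply skipping one base, and the clustering and stitching step is omitted. It only shows the core idea of repeatedly finding the longest read prefix that matches the reference exactly, then restarting the search from the first unmapped base.

```python
def build_suffix_array(text: str) -> list[int]:
    """Naive suffix array: start positions of all suffixes in sorted order.
    Quadratic-ish cost; acceptable only for a toy reference."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def maximal_mappable_prefix(read: str, text: str, sa: list[int]):
    """Return (length, positions) of the longest prefix of `read`
    that occurs exactly in `text`."""
    # Binary search for where `read` would be inserted among the sorted suffixes.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:] < read:
            lo = mid + 1
        else:
            hi = mid
    # In sorted order, the longest shared prefix is with a neighbour
    # of the insertion point.
    best = 0
    for idx in (lo - 1, lo):
        if 0 <= idx < len(sa):
            best = max(best, common_prefix_len(read, text[sa[idx]:]))
    if best == 0:
        return 0, []
    # Simplification: collect all occurrences by a linear scan rather than
    # by walking the contiguous suffix-array interval.
    prefix = read[:best]
    positions = sorted(p for p in sa if text.startswith(prefix, p))
    return best, positions


def seed_search(read: str, text: str, sa: list[int]):
    """Sequentially split `read` into exact-match seeds.
    Returns a list of (read_offset, seed_length, reference_positions)."""
    seeds = []
    offset = 0
    while offset < len(read):
        length, positions = maximal_mappable_prefix(read[offset:], text, sa)
        if length == 0:
            offset += 1  # assumption: skip a base that maps nowhere
            continue
        seeds.append((offset, length, positions))
        offset += length
    return seeds


if __name__ == "__main__":
    # Toy reference: two "exons" separated by an "intron"; the read spans the junction.
    genome = "AACCGGTT" + "GTCACACAAG" + "TTAGGCAT"
    read = "CCGGTT" + "TTAGG"
    sa = build_suffix_array(genome)
    for seed in seed_search(read, genome, sa):
        print(seed)
```

In this toy case the read splits into two seeds whose reference positions lie on either side of the intron-like gap, which is the kind of signal a subsequent stitching step would join into a spliced alignment.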
6.
Nature ; 489(7414): 101-8, 2012 Sep 06.
Article En | MEDLINE | ID: mdl-22955620

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.


DNA/genetics , Encyclopedias as Topic , Genome, Human/genetics , Molecular Sequence Annotation , Regulatory Sequences, Nucleic Acid/genetics , Transcription, Genetic/genetics , Transcriptome/genetics , Alleles , Cell Line , DNA, Intergenic/genetics , Enhancer Elements, Genetic , Exons/genetics , Gene Expression Profiling , Genes/genetics , Genomics , Humans , Polyadenylation/genetics , Protein Isoforms/genetics , RNA/biosynthesis , RNA/genetics , RNA Editing/genetics , RNA Splicing/genetics , Repetitive Sequences, Nucleic Acid/genetics , Sequence Analysis, RNA
7.
Nature ; 471(7339): 473-9, 2011 Mar 24.
Article En | MEDLINE | ID: mdl-21179090

Drosophila melanogaster is one of the most well studied genetic model organisms; nonetheless, its genome still contains unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are pre-requisites for understanding how the regulation of transcription, splicing and RNA editing directs the development of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches. These data substantially expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of transcriptome dynamics throughout development.


Drosophila melanogaster/growth & development , Drosophila melanogaster/genetics , Gene Expression Profiling , Gene Expression Regulation, Developmental/genetics , Transcription, Genetic/genetics , Alternative Splicing/genetics , Animals , Base Sequence , Drosophila Proteins/genetics , Drosophila melanogaster/embryology , Exons/genetics , Female , Genes, Insect/genetics , Genome, Insect/genetics , Male , MicroRNAs/genetics , Oligonucleotide Array Sequence Analysis , Protein Isoforms/genetics , RNA Editing/genetics , RNA, Messenger/analysis , RNA, Messenger/genetics , RNA, Small Untranslated/analysis , RNA, Small Untranslated/genetics , Sequence Analysis , Sex Characteristics
8.
Genome Res ; 21(2): 301-14, 2011 Feb.
Article En | MEDLINE | ID: mdl-21177962

Drosophila melanogaster cell lines are important resources for cell biologists. Here, we catalog the expression of exons, genes, and unannotated transcriptional signals for 25 lines. Unannotated transcription is substantial (typically 19% of euchromatic signal). Conservatively, we identify 1405 novel transcribed regions; 684 of these appear to be new exons of neighboring, often distant, genes. Sixty-four percent of genes are expressed detectably in at least one line, but only 21% are detected in all lines. Each cell line expresses, on average, 5885 genes, including a common set of 3109. Expression levels vary over several orders of magnitude. Major signaling pathways are well represented: most differentiation pathways are "off" and survival/growth pathways "on." Roughly 50% of the genes expressed by each line are not part of the common set, and these show considerable individuality. Thirty-one percent are expressed at a higher level in at least one cell line than in any single developmental stage, suggesting that each line is enriched for genes characteristic of small sets of cells. Most remarkable is that imaginal disc-derived lines can generally be assigned, on the basis of expression, to small territories within developing discs. These mappings reveal unexpected stability of even fine-grained spatial determination. No two cell lines show identical transcription factor expression. We conclude that each line has retained features of an individual founder cell superimposed on a common "cell line" gene expression pattern.


Drosophila melanogaster/genetics , Genetic Variation , Transcription, Genetic , Animals , Cell Line , Cluster Analysis , Exons , Female , Gene Expression Profiling , Male , Molecular Sequence Data , Signal Transduction/genetics , Transcription Factors/genetics
...