ABSTRACT
Combinatorial interactions among transcription factors are critical to directing tissue-specific gene expression. To build a global atlas of these combinations, we have screened for physical interactions among the majority of human and mouse DNA-binding transcription factors (TFs). The complete networks contain 762 human and 877 mouse interactions. Analysis of the networks reveals that highly connected TFs are broadly expressed across tissues, and that roughly half of the measured interactions are conserved between mouse and human. The data highlight the importance of TF combinations for determining cell fate, and they lead to the identification of a SMAD3/FLI1 complex expressed during development of immunity. The availability of large TF combinatorial networks in both human and mouse will provide many opportunities to study gene regulation, tissue differentiation, and mammalian evolution.
Subject(s)
Gene Expression Regulation , Gene Regulatory Networks , Transcription Factors/metabolism , Animals , Cell Differentiation , Evolution, Molecular , Humans , Mice , Monocytes/cytology , Organ Specificity , Smad3 Protein/metabolism , Trans-Activators/metabolism
ABSTRACT
Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.
Subject(s)
Databases, Genetic , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/genetics , Transcriptome/genetics , Cells, Cultured , Conserved Sequence/genetics , Datasets as Topic , Enhancer Elements, Genetic/genetics , Epigenesis, Genetic , Gene Expression Profiling , Gene Expression Regulation , Genome, Human/genetics , Genome-Wide Association Study , Genomics , Humans , Internet , Molecular Sequence Annotation , Organ Specificity/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , Quantitative Trait Loci/genetics , RNA Stability , RNA, Messenger/genetics
ABSTRACT
Retinitis pigmentosa (RP) is the leading cause of blindness with nearly two million people affected worldwide. Many genes have been implicated in RP, yet in 30-80% of RP patients the genetic cause remains unknown. A similar phenotype, progressive retinal atrophy (PRA), affects many dog breeds including the Miniature Schnauzer. We performed clinical, genetic and functional experiments to identify the genetic cause of PRA in the breed. The age of onset and pattern of disease progression suggested that at least two forms of PRA, types 1 and 2, affect the breed, which was confirmed by a genome-wide association study that implicated two distinct genomic loci on chromosomes 15 and X, respectively. Whole-genome sequencing revealed a fully segregating recessive regulatory variant in type 1 PRA. The associated variant has a very recent origin based on haplotype analysis and lies within a regulatory site containing the predicted binding site of the HAND1::TCF3 transcription factor complex. Luciferase assays suggested that the mutated regulatory sequence increases expression. A case-control comparison of retinal expression of the six best HAND1::TCF3 target genes, performed with a quantitative reverse-transcriptase PCR assay, indicated overexpression of EDN2 and COL9A2 in the affected retina. Defects in both EDN2 and COL9A2 have been previously associated with retinal degeneration. In summary, our study describes two genetically different forms of PRA and identifies a fully penetrant variant in the type 1 form with a possible regulatory effect. This would be among the first reports of a regulatory variant in retinal degeneration in any species, and it establishes a new spontaneous dog model to improve our understanding of retinal biology and gene regulation, while the affected breed will benefit from reliable genetic testing.
Subject(s)
Dog Diseases/genetics , Retinal Degeneration/genetics , Retinitis Pigmentosa/genetics , Animals , Case-Control Studies , Collagen Type IX/genetics , Collagen Type IX/metabolism , Dogs , Endothelin-2/genetics , Endothelin-2/metabolism , Female , Frameshift Mutation/genetics , Genome-Wide Association Study/methods , Haplotypes/genetics , Male , Models, Animal , Mutation/genetics , Pedigree , Phenotype , Retina/metabolism , Retinitis Pigmentosa/metabolism
ABSTRACT
BACKGROUND: Distinguishing ductal carcinoma in situ (DCIS) from invasive ductal carcinoma (IDC) regions in clinical biopsies constitutes a diagnostic challenge. Spatial transcriptomics (ST) is an in situ capturing method that allows quantification and visualization of transcriptomes in individual tissue sections. Previous studies have shown that the transcriptomes of breast cancer samples can be studied with spatial resolution in individual tissue sections. Supervised machine learning methods have also been used in clinical studies to predict clinical outcomes for cancer types. METHODS: We used four publicly available ST breast cancer datasets from breast tissue sections annotated by pathologists as non-malignant, DCIS, or IDC. We trained and tested a machine learning method (support vector machine) based on the expert annotation as well as on automatic selection of cell types by their transcriptome profiles. RESULTS: We identified expression signatures for expert annotated regions (non-malignant, DCIS, and IDC) and built machine learning models. Classification results for 798 expression signature transcripts showed high coincidence with the expert pathologist annotation for DCIS (100%) and IDC (96%). Extending our analysis to include all 25,179 expressed transcripts resulted in an accuracy of 99% for DCIS and 98% for IDC. Further, classification based on an automatically identified expression signature covering all ST spots of tissue sections resulted in prediction accuracy of 95% for DCIS and 91% for IDC. CONCLUSIONS: This concept study suggests that the ST signatures learned from expert selected breast cancer tissue sections can be used to identify breast cancer regions in whole tissue sections, including regions not trained on. Furthermore, the identified expression signatures can classify cancer regions in tissue sections not used for training with high accuracy. Expert-generated and even automatically generated cancer signatures from ST data might be able to classify breast cancer regions and provide clinical decision support for pathologists in the future.
Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms/diagnosis , Carcinoma, Ductal, Breast/diagnosis , Carcinoma, Intraductal, Noninfiltrating/diagnosis , Machine Learning , Molecular Typing/methods , Transcriptome , Breast Neoplasms/classification , Breast Neoplasms/genetics , Carcinoma, Ductal, Breast/genetics , Carcinoma, Intraductal, Noninfiltrating/genetics , Female , Humans , ROC Curve , Spatial Analysis
ABSTRACT
Enhancers control the correct temporal and cell-type-specific activation of gene expression in multicellular eukaryotes. Knowing their properties, regulatory activity and targets is crucial to understand the regulation of differentiation and homeostasis. Here we use the FANTOM5 panel of samples, covering the majority of human tissues and cell types, to produce an atlas of active, in vivo-transcribed enhancers. We show that enhancers share properties with CpG-poor messenger RNA promoters but produce bidirectional, exosome-sensitive, relatively short unspliced RNAs, the generation of which is strongly related to enhancer activity. The atlas is used to compare regulatory programs between different cells at unprecedented depth, to identify disease-associated regulatory single nucleotide polymorphisms, and to classify cell-type-specific and ubiquitous enhancers. We further explore the utility of enhancer redundancy, which explains gene expression strength rather than expression patterns. The online FANTOM5 enhancer atlas represents a unique resource for studies on cell-type-specific enhancers and gene regulation.
Subject(s)
Atlases as Topic , Enhancer Elements, Genetic/genetics , Gene Expression Regulation/genetics , Molecular Sequence Annotation , Organ Specificity , Cell Line , Cells, Cultured , Cluster Analysis , Genetic Predisposition to Disease/genetics , HeLa Cells , Humans , Polymorphism, Single Nucleotide/genetics , Promoter Regions, Genetic/genetics , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Transcription Initiation Site , Transcription Initiation, Genetic
ABSTRACT
BACKGROUND: The work of the FANTOM5 Consortium has brought forth a new level of understanding of the regulation of gene transcription and the cellular processes involved in creating diversity of cell types. In this study, we extended the analysis of the FANTOM5 Cap Analysis of Gene Expression (CAGE) transcriptome data to focus on understanding the genetic regulators involved in mouse cerebellar development. RESULTS: We used the HeliScopeCAGE library sequencing on cerebellar samples over 8 embryonic and 4 early postnatal times. This study showcases temporal expression pattern changes during cerebellar development. Through a bioinformatics analysis that focused on transcription factors, their promoters and binding sites, we identified genes that appear as strong candidates for involvement in cerebellar development. We selected several candidate transcriptional regulators for validation experiments including qRT-PCR and shRNA transcript knockdown. We observed marked and reproducible developmental defects in Atf4, Rfx3, and Scrt2 knockdown embryos, which support the role of these genes in cerebellar development. CONCLUSIONS: The successful identification of these novel gene regulators in cerebellar development demonstrates that the FANTOM5 cerebellum time series is a high-quality transcriptome database for functional investigation of gene regulatory networks in cerebellar development.
Subject(s)
Cerebellum/growth & development , Gene Expression Profiling , Nucleotide Motifs/genetics , Transcription, Genetic/genetics , Activating Transcription Factor 4/deficiency , Activating Transcription Factor 4/genetics , Activating Transcription Factor 4/metabolism , Animals , Cerebellum/embryology , Cerebellum/metabolism , Gene Expression Regulation, Developmental , Gene Knockdown Techniques , Mice , Mice, Inbred C57BL , Promoter Regions, Genetic/genetics , Regulatory Factor X Transcription Factors/deficiency , Regulatory Factor X Transcription Factors/genetics , Regulatory Factor X Transcription Factors/metabolism , Transcription Factors/deficiency , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
BACKGROUND: Evolutionarily conserved RFX transcription factors (TFs) regulate their target genes through a DNA sequence motif called the X-box. Thereby they regulate cellular specialization and terminal differentiation. Here, we provide a comprehensive analysis of all the eight human RFX genes (RFX1-8), their spatial and temporal expression profiles, potential upstream regulators and target genes. RESULTS: We extracted all known human RFX1-8 gene expression profiles from the FANTOM5 database derived from transcription start site (TSS) activity as captured by Cap Analysis of Gene Expression (CAGE) technology. RFX genes are broadly (RFX1-3, RFX5, RFX7) and specifically (RFX4, RFX6) expressed in different cell types, with high expression in four organ systems: immune system, gastrointestinal tract, reproductive system and nervous system. Tissue type specific expression profiles link defined RFX family members with the target gene batteries they regulate. We experimentally confirmed novel TSS locations and characterized the previously undescribed RFX8 to be lowly expressed. RFX tissue and cell type specificity arises mainly from differences in TSS architecture. RFX transcript isoforms lacking a DNA binding domain (DBD) open up new possibilities for combinatorial target gene regulation. Our results favor a new grouping of the RFX family based on protein domain composition. We uncovered and experimentally confirmed the TFs SP2 and ESR1 as upstream regulators of specific RFX genes. Using TF binding profiles from the JASPAR database, we determined relevant patterns of X-box motif positioning with respect to gene TSS locations of human RFX target genes. CONCLUSIONS: The wealth of data we provide will serve as the basis for precisely determining the roles RFX TFs play in human development and disease.
Subject(s)
Gene Expression Regulation , Genome, Human , Promoter Regions, Genetic , Regulatory Factor X Transcription Factors/genetics , Regulatory Sequences, Nucleic Acid , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Humans , Transcription Initiation Site
ABSTRACT
CORRECTION: The authors of the original article [1] would like to recognize the critical contribution of core members of the FANTOM5 Consortium, who played a key role in the HeliScopeCAGE sequencing experiments, quality control of tag reads and processing of the raw sequencing data.
ABSTRACT
Lymphangiogenesis plays a crucial role during development, in cancer metastasis and in inflammation. Activation of VEGFR-3 (also known as FLT4) by VEGF-C is one of the main drivers of lymphangiogenesis, but the transcriptional events downstream of VEGFR-3 activation are largely unknown. Recently, we identified a wave of immediate early transcription factors that are upregulated in human lymphatic endothelial cells (LECs) within the first 30 to 80 min after VEGFR-3 activation. Expression of these transcription factors must be regulated by additional pre-existing transcription factors that are rapidly activated by VEGFR-3 signaling. Using transcription factor activity analysis, we identified the homeobox transcription factor HOXD10 to be specifically activated at early time points after VEGFR-3 stimulation, and to regulate expression of immediate early transcription factors, including NR4A1. Gain- and loss-of-function studies revealed that HOXD10 is involved in LEC migration and formation of cord-like structures. Furthermore, HOXD10 regulates expression of VE-cadherin, claudin-5 and NOS3 (also known as e-NOS), and promotes lymphatic endothelial permeability. Taken together, these results reveal an important and unanticipated role of HOXD10 in the regulation of VEGFR-3 signaling in lymphatic endothelial cells, and in the control of lymphangiogenesis and permeability.
Subject(s)
Homeodomain Proteins/genetics , Neoplasms/genetics , Nuclear Receptor Subfamily 4, Group A, Member 1/genetics , Transcription Factors/genetics , Vascular Endothelial Growth Factor C/genetics , Vascular Endothelial Growth Factor Receptor-3/genetics , Cell Line , Cell Membrane Permeability/genetics , Cell Movement/genetics , Endothelial Cells/metabolism , Endothelial Cells/pathology , Gene Expression Regulation, Neoplastic , Humans , Lymphangiogenesis/genetics , Neoplasm Metastasis , Neoplasms/pathology , Signal Transduction , Vascular Endothelial Growth Factor C/biosynthesis , Vascular Endothelial Growth Factor Receptor-3/biosynthesis
ABSTRACT
Laser-capture microdissection was used to isolate external germinal layer tissue from three developmental periods of mouse cerebellar development: embryonic days 13, 15, and 18. The cerebellar granule cell-enriched mRNA library was generated with next-generation sequencing using the Helicos technology. Our objective was to discover transcriptional regulators that could be important for the development of cerebellar granule cells, the most numerous neurons in the central nervous system. Through differential expression analysis, we have identified 82 differentially expressed transcription factors (TFs) from a total of 1311 differentially expressed genes. In addition, with TF-binding sequence analysis, we have identified 46 TF candidates that could be key regulators responsible for the variation in the granule cell transcriptome between developmental stages. Altogether, we identified 125 potential TFs (82 from differential expression analysis, 46 from motif analysis with 3 overlaps in the two sets). From this gene set, 37 TFs are considered novel due to the lack of previous knowledge about their roles in cerebellar development. The results from transcriptome-wide analyses were validated with existing online databases, qRT-PCR, and in situ hybridization. This study provides an initial insight into the TFs of cerebellar granule cells that might be important for development and provides valuable information for further functional studies on these transcriptional regulators.
Subject(s)
Cerebellum/embryology , Cerebellum/metabolism , Neurons/metabolism , Transcription Factors/metabolism , Animals , Computer Simulation , Gene Expression Profiling , Gene Expression Regulation, Developmental , In Situ Hybridization , Laser Capture Microdissection , Mice, Inbred C57BL , Real-Time Polymerase Chain Reaction , Transcriptome
ABSTRACT
Understanding the normal state of human tissue transcriptome profiles is essential for recognizing tissue disease states and identifying disease markers. Recently, the Human Protein Atlas and the FANTOM5 consortium have each published extensive transcriptome data for human samples using Illumina-sequenced RNA-Seq and Heliscope-sequenced CAGE. Here, we report on the first large-scale complex tissue transcriptome comparison between full-length and 5'-capped mRNA sequencing data. Overall gene expression correlation was high between the 22 corresponding tissues analyzed (R > 0.8). For genes ubiquitously expressed across all tissues, the two data sets showed high genome-wide correlation (91% agreement), with differences observed for a small number of individual genes indicating the need to update their gene models. Among the identified single-tissue enriched genes, up to 75% showed consensus of 7-fold enrichment in the same tissue in both methods, while another 17% exhibited multiple tissue enrichment and/or high expression variety in the other data set, likely dependent on the cell type proportions included in each tissue sample. Our results show that RNA-Seq and CAGE tissue transcriptome data sets are highly complementary for improving gene model annotations and highlight biological complexities within tissue transcriptomes. Furthermore, integration with image-based protein expression data is highly advantageous for understanding expression specificities for many genes.
Subject(s)
Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Databases, Protein , Genomics/methods , Humans , Immunohistochemistry , Molecular Sequence Annotation , Proteome/metabolism , Tissue Distribution
ABSTRACT
Hematopoietic differentiation is governed by a complex regulatory program controlling the generation of different lineages of blood cells from multipotent hematopoietic stem cells. The transcriptional program that dictates hematopoietic cell fate and differentiation requires an epigenetic memory function provided by a network of epigenetic factors regulating DNA methylation, posttranslational histone modifications, and chromatin structure. Aberrant interactions between epigenetic factors and transcription factors cause perturbations in the blood cell differentiation program that result in various types of hematopoietic disorders. To elucidate the contributions of different epigenetic factors in human hematopoiesis, high-throughput cap analysis of gene expression was used to build transcription profiles of 199 epigenetic factors in a wide range of blood cells. Our epigenetic transcriptome analysis revealed cell type- (eg, HELLS and ACTL6A), lineage- (eg, MLL), and/or leukemia- (eg, CHD2, CBX8, and EPC1) specific expression of several epigenetic factors. In addition, we show that several epigenetic factors use alternative transcription start sites in different cell types. This analysis could serve as a resource for the scientific community for further characterization of the role of these epigenetic factors in blood development.
Subject(s)
Epigenesis, Genetic , Gene Expression Regulation , Hematopoiesis/genetics , Hematopoiesis/physiology , Cell Differentiation , Cell Lineage , DNA Methylation , Gene Expression Profiling , Hematopoietic Stem Cells/metabolism , Humans , Principal Component Analysis , Transcription, Genetic
Subject(s)
Genomics , Molecular Sequence Annotation , Sequence Analysis, DNA , Computational Biology , Gene Ontology , Genomics/methods , Genomics/standards , Metadata , Molecular Sequence Annotation/methods , Molecular Sequence Annotation/standards , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards
ABSTRACT
The immediate-early response mediates cell fate in response to a variety of extracellular stimuli and is dysregulated in many cancers. However, the specificity of the response across stimuli and cell types, and the roles of non-coding RNAs are not well understood. Using a large collection of densely-sampled time series expression data, we have examined the induction of the immediate-early response in unparalleled detail, across cell types and stimuli. We exploit cap analysis of gene expression (CAGE) time series datasets to directly measure promoter activities over time. Using a novel analysis method for time series data, we identify transcripts with expression patterns that closely resemble the dynamics of known immediate-early genes (IEGs), and this enables a comprehensive comparative study of these genes and their chromatin state. Surprisingly, these data suggest that the earliest transcriptional responses often involve promoters generating non-coding RNAs, many of which are produced in advance of canonical protein-coding IEGs. IEGs are known to be capable of induction without de novo protein synthesis. Consistent with this, we find that the response of both protein-coding and non-coding RNA IEGs can be explained by their transcriptionally poised, permissive chromatin state prior to stimulation. We also explore the function of non-coding RNAs in the attenuation of the immediate early response in a small RNA sequencing dataset matched to the CAGE data: we identify a novel set of microRNAs responsible for the attenuation of the IEG response in an estrogen receptor positive cancer cell line. Our computational statistical method is well suited to meta-analyses as there is no requirement for transcripts to pass thresholds for significant differential expression between time points, and it is agnostic to the number of time points per dataset.
Subject(s)
Immediate-Early Proteins/genetics , RNA, Untranslated/genetics , Transcription, Genetic/genetics , Computational Biology , Humans , Immediate-Early Proteins/metabolism , Kinetics , MCF-7 Cells , MicroRNAs/genetics , MicroRNAs/metabolism , Models, Statistical , RNA, Untranslated/metabolism
ABSTRACT
BACKGROUND: Children with problematic severe asthma have poor disease control despite high doses of inhaled corticosteroids and additional therapy, leading to personal suffering, early deterioration of lung function, and significant consumption of health care resources. If no exacerbating factors, such as smoking or allergies, are found after extensive investigation, these children are given a diagnosis of therapy-resistant (or therapy-refractory) asthma (SA). OBJECTIVE: We sought to deepen our understanding of childhood SA by analyzing gene expression and modeling the underlying regulatory transcription factor networks in peripheral blood leukocytes. METHODS: Gene expression was analyzed by using Cap Analysis of Gene Expression in children with SA (n = 13), children with controlled persistent asthma (n = 15), and age-matched healthy control subjects (n = 9). Cap Analysis of Gene Expression sequencing detects the transcription start sites of known and novel mRNAs and noncoding RNAs. RESULTS: Sample groups could be separated by hierarchical clustering on 1305 differentially expressed transcription start sites, including 816 known genes and several novel transcripts. Ten of 13 tested novel transcripts were validated by means of RT-PCR and Sanger sequencing. Expression of RAR-related orphan receptor A (RORA), which has been linked to asthma in genome-wide association studies, was significantly upregulated in patients with SA. Gene network modeling revealed decreased glucocorticoid receptor signaling and increased activity of the mitogen-activated protein kinase and Jun kinase cascades in patients with SA. CONCLUSION: Circulating leukocytes from children with controlled asthma and those with SA have distinct gene expression profiles, demonstrating the possible development of specific molecular biomarkers and supporting the need for novel therapeutic approaches.
Subject(s)
Asthma/drug therapy , Asthma/genetics , Drug Resistance/genetics , Glucocorticoids/therapeutic use , RNA, Messenger/genetics , Transcriptome , Adolescent , Asthma/pathology , Case-Control Studies , Child , Child, Preschool , Female , Gene Expression Profiling , Genome-Wide Association Study , Humans , JNK Mitogen-Activated Protein Kinases/genetics , Male , Nuclear Receptor Subfamily 1, Group F, Member 1/genetics , Receptors, Glucocorticoid/genetics , Severity of Illness Index
ABSTRACT
Odorous chemicals are detected in the mouse main olfactory epithelium (MOE) by about 1100 types of olfactory receptors (OR) expressed by olfactory sensory neurons (OSNs). Each mature OSN is thought to express only one allele of a single OR gene. Major impediments to understanding the transcriptional control of OR gene expression are the lack of a proper characterization of OR transcription start sites (TSSs) and promoters, and of regulatory transcripts at OR loci. We have applied the nanoCAGE technology to profile the transcriptome and the active promoters in the MOE. nanoCAGE analysis revealed the map and architecture of promoters for 87.5% of the mouse OR genes, as well as the expression of many novel noncoding RNAs including antisense transcripts. We identified candidate transcription factors for OR gene expression and among them confirmed by chromatin immunoprecipitation the binding of TBP, EBF1 (OLF1), and MEF2A to OR promoters. Finally, we showed that a short genomic fragment flanking the major TSS of the OR gene Olfr160 (M72) can drive OSN-specific expression in transgenic mice.
Subject(s)
Promoter Regions, Genetic , Receptors, Odorant/genetics , 3' Untranslated Regions , Animals , Base Sequence , Binding Sites , Consensus Sequence , DNA-Binding Proteins/metabolism , Gene Expression Profiling , Gene Expression Regulation , Gene Order , Genetic Loci , MEF2 Transcription Factors , Mice , Mice, Transgenic , Myogenic Regulatory Factors/metabolism , Olfactory Mucosa/metabolism , Olfactory Receptor Neurons/metabolism , TATA-Box Binding Protein/metabolism , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic
ABSTRACT
Activation-induced cytidine deaminase (AID) is required for both somatic hypermutation and class-switch recombination in activated B cells. AID is also known to target nonimmunoglobulin genes and introduce mutations or chromosomal translocations, eventually causing tumors. To identify as-yet-unknown AID targets, we screened early AID-induced DNA breaks by using two independent genome-wide approaches. Along with known AID targets, this screen identified a set of unique genes (SNHG3, MALAT1, BCL7A, and CUX1) and confirmed that these loci accumulated mutations as frequently as the Ig locus after AID activation. Moreover, these genes share three important characteristics with the Ig gene: translocations in tumors, repetitive sequences, and the epigenetic modification of chromatin by H3K4 trimethylation in the vicinity of cleavage sites.
Subject(s)
Cytidine Deaminase/genetics , Genes, Immunoglobulin , Biotin/metabolism , Humans , Mutation , Polymerase Chain Reaction
ABSTRACT
Evolutionary change in gene expression is generally considered to be a major driver of phenotypic differences between species. We investigated innate immune diversification by analyzing interspecies differences in the transcriptional responses of primary human and mouse macrophages to the Toll-like receptor (TLR)-4 agonist lipopolysaccharide (LPS). By using a custom platform permitting cross-species interrogation coupled with deep sequencing of mRNA 5' ends, we identified extensive divergence in LPS-regulated orthologous gene expression between humans and mice (24% of orthologues were identified as "divergently regulated"). We further demonstrate concordant regulation of human-specific LPS target genes in primary pig macrophages. Divergently regulated orthologues were enriched for genes encoding cellular "inputs" such as cell surface receptors (e.g., TLR6, IL-7Rα) and functional "outputs" such as inflammatory cytokines/chemokines (e.g., CCL20, CXCL13). Conversely, intracellular signaling components linking inputs to outputs were typically concordantly regulated. Functional consequences of divergent gene regulation were confirmed by showing LPS pretreatment boosts subsequent TLR6 responses in mouse but not human macrophages, in keeping with mouse-specific TLR6 induction. Divergently regulated genes were associated with a large dynamic range of gene expression, and specific promoter architectural features (TATA box enrichment, CpG island depletion). Surprisingly, regulatory divergence was also associated with enhanced interspecies promoter conservation. Thus, the genes controlled by complex, highly conserved promoters that facilitate dynamic regulation are also the most susceptible to evolutionary change.
Subject(s)
Gene Expression Profiling , Genetic Variation , Macrophages/metabolism , Toll-Like Receptor 4/genetics , Animals , Cell Line , Cells, Cultured , Chemokine CCL20/genetics , Chemokine CXCL13/genetics , Evolution, Molecular , Female , Gene Expression Regulation/drug effects , Host-Pathogen Interactions , Humans , Lipopolysaccharides/pharmacology , Macrophages/drug effects , Macrophages/microbiology , Male , Mice , Mice, Inbred BALB C , Mice, Inbred C57BL , Mice, Knockout , Oligonucleotide Array Sequence Analysis , Reverse Transcriptase Polymerase Chain Reaction , Salmonella typhimurium/physiology , Species Specificity , Swine , Toll-Like Receptor 4/agonists
ABSTRACT
MicroRNAs are small non-coding RNAs that inhibit the translation of target mRNAs. In humans, most microRNAs are transcribed by RNA polymerase II as long primary transcripts and processed by sequential cleavage by the two RNase III enzymes, DROSHA and DICER, into precursor and mature microRNAs, respectively. Although the fundamental functions of microRNAs in RNA silencing have been gradually uncovered, less is known about the regulatory mechanisms of microRNA expression. Here, we report that telomerase reverse transcriptase (TERT) extensively affects the expression levels of mature microRNAs. Deep sequencing-based screens of short RNA populations revealed that the suppression of TERT resulted in the downregulation of microRNAs expressed in THP-1 cells and HeLa cells. Primary and precursor microRNA levels were also reduced under the suppression of TERT. Similar results were obtained with the suppression of either BRG1 (also called SMARCA4) or nucleostemin, which are proteins interacting with TERT and functioning beyond telomeres. These results suggest that TERT regulates microRNAs at the very early phases in their biogenesis, presumably through non-telomerase mechanism(s).
Subject(s)
MicroRNAs/metabolism , Telomerase/metabolism , Cell Line, Tumor , DEAD-box RNA Helicases/genetics , DEAD-box RNA Helicases/metabolism , DNA Helicases/antagonists & inhibitors , DNA Helicases/genetics , DNA Helicases/metabolism , Down-Regulation , GTP-Binding Proteins/antagonists & inhibitors , GTP-Binding Proteins/genetics , GTP-Binding Proteins/metabolism , HeLa Cells , Humans , Nuclear Proteins/antagonists & inhibitors , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , RNA Interference , RNA, Small Interfering/metabolism , Ribonuclease III/genetics , Ribonuclease III/metabolism , Telomerase/antagonists & inhibitors , Telomerase/genetics , Transcription Factors/antagonists & inhibitors , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
BACKGROUND: Deciphering the most common modes by which chromatin regulates transcription, and how this is related to cellular status and processes is an important task for improving our understanding of human cellular biology. The FANTOM5 and ENCODE projects represent two independent large scale efforts to map regulatory and transcriptional features to the human genome. Here we investigate chromatin features around a comprehensive set of transcription start sites in four cell lines by integrating data from these two projects. RESULTS: Transcription start sites can be distinguished by chromatin states defined by specific combinations of both chromatin mark enrichment and the profile shapes of these chromatin marks. The observed patterns can be associated with cellular functions and processes, and they also show association with expression level, location relative to nearby genes, and CpG content. In particular we find a substantial number of repressed inter- and intra-genic transcription start sites enriched for active chromatin marks and Pol II, and these sites are strongly associated with immediate-early response processes and cell signaling. Associations between start sites with similar chromatin patterns are validated by significant correlations in their global expression profiles. CONCLUSIONS: The results confirm the link between chromatin state and cellular function for expressed transcripts, and also indicate that active chromatin states at repressed transcripts may poise transcripts for rapid activation during immune response.