Results 1 - 20 of 110
1.
Genome Res ; 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-35961773

ABSTRACT

In eukaryotes, capped RNAs include long transcripts such as messenger RNAs and long noncoding RNAs, as well as shorter transcripts such as spliceosomal RNAs, small nucleolar RNAs, and enhancer RNAs. Long capped transcripts can be profiled using cap analysis gene expression (CAGE) sequencing and other methods. Here, we describe a sequencing library preparation protocol for short capped RNAs, apply it to a differentiation time course of the human cell line THP-1, and systematically compare the landscape of short capped RNAs to that of long capped RNAs. Transcription initiation peaks associated with genes in the sense direction have a strong preference to produce either long or short capped RNAs, with one out of six peaks detected in the short capped RNA libraries only. Gene-associated short capped RNAs have highly specific 3' ends, typically overlapping splice sites. Enhancers also preferentially generate either short or long capped RNAs, with 10% of enhancers observed in the short capped RNA libraries only. Enhancers producing either short or long capped RNAs show enrichment for GWAS-associated disease SNPs. We conclude that deep sequencing of short capped RNAs reveals new families of noncoding RNAs and elucidates the diversity of transcripts generated at known and novel promoters and enhancers.

2.
Jpn J Clin Oncol ; 54(1): 38-46, 2024 Jan 07.
Article in English | MEDLINE | ID: mdl-37815156

ABSTRACT

OBJECTIVE: Endometrial cancer is the most common gynaecological cancer, and most patients are identified during early disease stages. Noninvasive evaluation of lymph node metastasis likely will improve the quality of clinical treatment, for example, by omitting unnecessary lymphadenectomy. METHODS: The study population comprised 611 patients with endometrial cancer who underwent lymphadenectomy at four types of institutions, comprising seven hospitals in total. We systematically assessed the association of 18 preoperative clinical variables with postoperative lymph node metastasis. We then constructed statistical models for preoperative lymph node metastasis prediction and assessed their performance with a previously proposed system, in which the score was determined by counting the number of high-risk variables among the four predefined ones. RESULTS: Of the preoperative 18 variables evaluated, 10 were significantly associated with postoperative lymph node metastasis. A logistic regression model achieved an area under the curve of 0.85 in predicting lymph node metastasis; this value is significantly higher than that from the previous system (area under the curve, 0.74). When we set the false-negative rate to ~1%, the new predictive model increased the rate of true negatives to 21%, compared with 6.8% from the previous one. We also provide a spreadsheet-based tool for further evaluation of its ability to predict lymph node metastasis in endometrial cancer. CONCLUSIONS: Our new lymph node metastasis prediction method, which was based solely on preoperative clinical variables, performed significantly better than the previous method. Although additional evaluation is necessary for its clinical use, our noninvasive system may help improve the clinical treatment of endometrial cancer, complementing minimally invasive sentinel lymph node biopsy.


Subject(s)
Endometrial Neoplasms , Sentinel Lymph Node Biopsy , Female , Humans , Lymphatic Metastasis/pathology , Lymph Node Excision , Endometrial Neoplasms/surgery , Endometrial Neoplasms/pathology , Models, Statistical , Lymph Nodes/surgery , Lymph Nodes/pathology
3.
Genome Res ; 30(7): 1073-1081, 2020 07.
Article in English | MEDLINE | ID: mdl-32079618

ABSTRACT

Long noncoding RNAs (lncRNAs) have emerged as key coordinators of biological and cellular processes. Characterizing lncRNA expression across cells and tissues is key to understanding their role in determining phenotypes, including human diseases. We present here FC-R2, a comprehensive expression atlas across a broadly defined human transcriptome, inclusive of over 109,000 coding and noncoding genes, as described in the FANTOM CAGE-Associated Transcriptome (FANTOM-CAT) study. This atlas greatly extends the gene annotation used in the original recount2 resource. We demonstrate the utility of the FC-R2 atlas by reproducing key findings from published large studies and by generating new results across normal and diseased human samples. In particular, we (a) identify tissue-specific transcription profiles for distinct classes of coding and noncoding genes, (b) perform differential expression analysis across thirteen cancer types, identifying novel noncoding genes potentially involved in tumor pathogenesis and progression, and (c) confirm the prognostic value for several enhancer lncRNAs expression in cancer. Our resource is instrumental for the systematic molecular characterization of lncRNA by the FANTOM6 Consortium. In conclusion, comprised of over 70,000 samples, the FC-R2 atlas will empower other researchers to investigate functions and biological roles of both known coding genes and novel lncRNAs.


Subject(s)
Transcriptome , Databases, Genetic , Enhancer Elements, Genetic , Gene Expression Profiling , Genome, Human , Humans , Neoplasms/genetics , Organ Specificity , Prognosis , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/metabolism
4.
Genome Res ; 30(7): 951-961, 2020 07.
Article in English | MEDLINE | ID: mdl-32718981

ABSTRACT

Gene expression profiles in homologous tissues have been observed to be different between species, which may be due to differences between species in the gene expression program in each cell type, but may also reflect differences in cell type composition of each tissue in different species. Here, we compare expression profiles in matching primary cells in human, mouse, rat, dog, and chicken using Cap Analysis Gene Expression (CAGE) and short RNA (sRNA) sequencing data from FANTOM5. While we find that expression profiles of orthologous genes in different species are highly correlated across cell types, in each cell type many genes were differentially expressed between species. Expression of genes with products involved in transcription, RNA processing, and transcriptional regulation was more likely to be conserved, while expression of genes encoding proteins involved in intercellular communication was more likely to have diverged during evolution. Conservation of expression correlated positively with the evolutionary age of genes, suggesting that divergence in expression levels of genes critical for cell function was restricted during evolution. Motif activity analysis showed that both promoters and enhancers are activated by the same transcription factors in different species. An analysis of expression levels of mature miRNAs and of primary miRNAs identified by CAGE revealed that evolutionary old miRNAs are more likely to have conserved expression patterns than young miRNAs. We conclude that key aspects of the regulatory network are conserved, while differential expression of genes involved in cell-to-cell communication may contribute greatly to phenotypic differences between species.


Subject(s)
Evolution, Molecular , Transcriptome , Animals , Chickens/genetics , Dogs , Gene Expression Profiling , Gene Regulatory Networks , Humans , Mice , MicroRNAs/metabolism , Nucleotide Motifs , Principal Component Analysis , Promoter Regions, Genetic , Rats , Species Specificity , Transcription Factors/metabolism
5.
Genome Res ; 30(7): 1060-1072, 2020 07.
Article in English | MEDLINE | ID: mdl-32718982

ABSTRACT

Long noncoding RNAs (lncRNAs) constitute the majority of transcripts in the mammalian genomes, and yet, their functions remain largely unknown. As part of the FANTOM6 project, we systematically knocked down the expression of 285 lncRNAs in human dermal fibroblasts and quantified cellular growth, morphological changes, and transcriptomic responses using Capped Analysis of Gene Expression (CAGE). Antisense oligonucleotides targeting the same lncRNAs exhibited global concordance, and the molecular phenotype, measured by CAGE, recapitulated the observed cellular phenotypes while providing additional insights on the affected genes and pathways. Here, we disseminate the largest-to-date lncRNA knockdown data set with molecular phenotyping (over 1000 CAGE deep-sequencing libraries) for further exploration and highlight functional roles for ZNF213-AS1 and lnc-KHDC3L-2.


Subject(s)
RNA, Long Noncoding/physiology , Cell Growth Processes/genetics , Cell Movement/genetics , Fibroblasts/cytology , Fibroblasts/metabolism , Humans , KCNQ Potassium Channels/metabolism , Molecular Sequence Annotation , Oligonucleotides, Antisense , RNA, Long Noncoding/antagonists & inhibitors , RNA, Long Noncoding/metabolism , RNA, Small Interfering
6.
Nature ; 543(7644): 199-204, 2017 03 09.
Article in English | MEDLINE | ID: mdl-28241135

ABSTRACT

Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.


Subject(s)
Databases, Genetic , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/genetics , Transcriptome/genetics , Cells, Cultured , Conserved Sequence/genetics , Datasets as Topic , Enhancer Elements, Genetic/genetics , Epigenesis, Genetic , Gene Expression Profiling , Gene Expression Regulation , Genome, Human/genetics , Genome-Wide Association Study , Genomics , Humans , Internet , Molecular Sequence Annotation , Organ Specificity/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , Quantitative Trait Loci/genetics , RNA Stability , RNA, Messenger/genetics
7.
Cancer Sci ; 112(2): 884-892, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33280191

ABSTRACT

Discrimination of Philadelphia-negative myeloproliferative neoplasms (Ph-MPNs) from reactive hypercytosis and myelofibrosis requires a constellation of testing including driver mutation analysis and bone marrow biopsies. We searched for a biomarker that can more easily distinguish Ph-MPNs from reactive hypercytosis and myelofibrosis by using RNA-seq analysis utilizing platelet-rich plasma (PRP)-derived RNAs from patients with essential thrombocythemia (ET) and reactive thrombocytosis, and CREB3L1 was found to have an extremely high impact in discriminating the two disorders. To validate and further explore the result, expression levels of CREB3L1 in PRP were quantified by reverse-transcription quantitative PCR and compared among patients with ET, other Ph-MPNs, chronic myeloid leukemia (CML), and reactive hypercytosis and myelofibrosis. A CREB3L1 expression cutoff value determined based on PRP of 18 healthy volunteers accurately discriminated 150 driver mutation-positive Ph-MPNs from other entities (71 reactive hypercytosis and myelofibrosis, 6 CML, and 18 healthy volunteers) and showed both sensitivity and specificity of 1.0000. Importantly, CREB3L1 expression levels were significantly higher in ET compared with reactive thrombocytosis (P < .0001), and polycythemia vera compared with reactive erythrocytosis (P < .0001). Pathology-affirmed triple-negative ET (TN-ET) patients were divided into a high- and low-CREB3L1-expression group, and some patients in the low-expression group achieved a spontaneous remission during the clinical course. In conclusion, CREB3L1 analysis has the potential to single-handedly discriminate driver mutation-positive Ph-MPNs from reactive hypercytosis and myelofibrosis, and also may identify a subgroup within TN-ET showing distinct clinical features including spontaneous remission.


Subject(s)
Biomarkers, Tumor/blood , Cyclic AMP Response Element-Binding Protein/blood , Myeloproliferative Disorders/diagnosis , Nerve Tissue Proteins/blood , Diagnosis, Differential , Humans , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/blood , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/diagnosis , Myeloproliferative Disorders/blood
8.
PLoS Biol ; 15(9): e2002887, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28873399

ABSTRACT

Cap Analysis of Gene Expression (CAGE) in combination with single-molecule sequencing technology allows precision mapping of transcription start sites (TSSs) and genome-wide capture of promoter activities in differentiated and steady state cell populations. Much less is known about whether TSS profiling can characterize diverse and non-steady state cell populations, such as the approximately 400 transitory and heterogeneous cell types that arise during ontogeny of vertebrate animals. To gain such insight, we used the chick model and performed CAGE-based TSS analysis on embryonic samples covering the full 3-week developmental period. In total, 31,863 robust TSS peaks (>1 tag per million [TPM]) were mapped to the latest chicken genome assembly, of which 34% to 46% were active in any given developmental stage. ZENBU, a web-based, open-source platform, was used for interactive data exploration. TSSs of genes critical for lineage differentiation could be precisely mapped and their activities tracked throughout development, suggesting that non-steady state and heterogeneous cell populations are amenable to CAGE-based transcriptional analysis. Our study also uncovered a large set of extremely stable housekeeping TSSs and many novel stage-specific ones. We furthermore demonstrated that TSS mapping could expedite motif-based promoter analysis for regulatory modules associated with stage-specific and housekeeping genes. Finally, using Brachyury as an example, we provide evidence that precise TSS mapping in combination with Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-on technology enables us, for the first time, to efficiently target endogenous avian genes for transcriptional activation. Taken together, our results represent the first report of genome-wide TSS mapping in birds and the first systematic developmental TSS analysis in any amniote species (birds and mammals). 
By facilitating promoter-based molecular analysis and genetic manipulation, our work also underscores the value of avian models in unravelling the complex regulatory mechanism of cell lineage specification during amniote development.


Subject(s)
Embryonic Development , Genome-Wide Association Study , Transcription Initiation Site , Animals , Biological Evolution , Chick Embryo , Clustered Regularly Interspaced Short Palindromic Repeats
9.
Nature ; 507(7493): 462-70, 2014 Mar 27.
Article in English | MEDLINE | ID: mdl-24670764

ABSTRACT

Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.


Subject(s)
Atlases as Topic , Molecular Sequence Annotation , Promoter Regions, Genetic/genetics , Transcriptome/genetics , Animals , Cell Line , Cells, Cultured , Cluster Analysis , Conserved Sequence/genetics , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Genes, Essential/genetics , Genome/genetics , Humans , Mice , Open Reading Frames/genetics , Organ Specificity , RNA, Messenger/analysis , RNA, Messenger/genetics , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic/genetics
10.
Nature ; 507(7493): 455-461, 2014 Mar 27.
Article in English | MEDLINE | ID: mdl-24670763

ABSTRACT

Enhancers control the correct temporal and cell-type-specific activation of gene expression in multicellular eukaryotes. Knowing their properties, regulatory activity and targets is crucial to understand the regulation of differentiation and homeostasis. Here we use the FANTOM5 panel of samples, covering the majority of human tissues and cell types, to produce an atlas of active, in vivo-transcribed enhancers. We show that enhancers share properties with CpG-poor messenger RNA promoters but produce bidirectional, exosome-sensitive, relatively short unspliced RNAs, the generation of which is strongly related to enhancer activity. The atlas is used to compare regulatory programs between different cells at unprecedented depth, to identify disease-associated regulatory single nucleotide polymorphisms, and to classify cell-type-specific and ubiquitous enhancers. We further explore the utility of enhancer redundancy, which explains gene expression strength rather than expression patterns. The online FANTOM5 enhancer atlas represents a unique resource for studies on cell-type-specific enhancers and gene regulation.


Subject(s)
Atlases as Topic , Enhancer Elements, Genetic/genetics , Gene Expression Regulation/genetics , Molecular Sequence Annotation , Organ Specificity , Cell Line , Cells, Cultured , Cluster Analysis , Genetic Predisposition to Disease/genetics , HeLa Cells , Humans , Polymorphism, Single Nucleotide/genetics , Promoter Regions, Genetic/genetics , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Transcription Initiation Site , Transcription Initiation, Genetic
11.
Nucleic Acids Res ; 46(22): 11898-11909, 2018 12 14.
Article in English | MEDLINE | ID: mdl-30407537

ABSTRACT

MicroRNAs (miRNAs) modulate the post-transcriptional regulation of target genes and are related to biology of complex human traits, but genetic landscape of miRNAs remains largely unknown. Given the strikingly tissue-specific miRNA expression profiles, we here expand a previous method to quantitatively evaluate enrichment of genome-wide association study (GWAS) signals on miRNA-target gene networks (MIGWAS) to further estimate tissue-specific enrichment. Our approach integrates tissue-specific expression profiles of miRNAs (~1800 miRNAs in 179 cells) with GWAS to test whether polygenic signals enrich in miRNA-target gene networks and whether they fall within specific tissues. We applied MIGWAS to 49 GWASs (nTotal = 3 520 246), and successfully identified biologically relevant tissues. Further, MIGWAS could point miRNAs as candidate biomarkers of the trait. As an illustrative example, we performed differentially expressed miRNA analysis between rheumatoid arthritis (RA) patients and healthy controls (n = 63). We identified novel biomarker miRNAs (e.g. hsa-miR-762) by integrating differentially expressed miRNAs with MIGWAS results for RA, as well as novel associated loci with significant genetic risk (rs56656810 at MIR762 at 16q11; n = 91 482, P = 3.6 × 10^-8). Our result highlighted that miRNA-target gene network contributes to human disease genetics in a cell type-specific manner, which could yield an efficient screening of miRNAs as promising biomarkers.


Subject(s)
Arthritis, Rheumatoid/genetics , Asthma/genetics , Colitis, Ulcerative/genetics , Gene Regulatory Networks , Genome, Human , Graves Disease/genetics , MicroRNAs/genetics , Algorithms , Arthritis, Rheumatoid/immunology , Arthritis, Rheumatoid/pathology , Asthma/immunology , Asthma/pathology , Biomarkers/metabolism , Case-Control Studies , Colitis, Ulcerative/immunology , Colitis, Ulcerative/pathology , Computational Biology/methods , Gene Expression Profiling , Gene Expression Regulation , Genetic Loci , Genome-Wide Association Study , Graves Disease/immunology , Graves Disease/pathology , Humans , MicroRNAs/classification , MicroRNAs/metabolism , Multifactorial Inheritance/genetics , Multifactorial Inheritance/immunology , Organ Specificity , Signal Transduction
12.
PLoS Genet ; 13(3): e1006641, 2017 03.
Article in English | MEDLINE | ID: mdl-28263993

ABSTRACT

The FANTOM5 consortium utilised cap analysis of gene expression (CAGE) to provide an unprecedented insight into transcriptional regulation in human cells and tissues. In the current study, we have used CAGE-based transcriptional profiling on an extended dense time course of the response of human monocyte-derived macrophages grown in macrophage colony-stimulating factor (CSF1) to bacterial lipopolysaccharide (LPS). We propose that this system provides a model for the differentiation and adaptation of monocytes entering the intestinal lamina propria. The response to LPS is shown to be a cascade of successive waves of transient gene expression extending over at least 48 hours, with hundreds of positive and negative regulatory loops. Promoter analysis using motif activity response analysis (MARA) identified some of the transcription factors likely to be responsible for the temporal profile of transcriptional activation. Each LPS-inducible locus was associated with multiple inducible enhancers, and in each case, transient eRNA transcription at multiple sites detected by CAGE preceded the appearance of promoter-associated transcripts. LPS-inducible long non-coding RNAs were commonly associated with clusters of inducible enhancers. We used these data to re-examine the hundreds of loci associated with susceptibility to inflammatory bowel disease (IBD) in genome-wide association studies. Loci associated with IBD were strongly and specifically (relative to rheumatoid arthritis and unrelated traits) enriched for promoters that were regulated in monocyte differentiation or activation. Amongst previously-identified IBD susceptibility loci, the vast majority contained at least one promoter that was regulated in CSF1-dependent monocyte-macrophage transitions and/or in response to LPS. 
On this basis, we concluded that IBD loci are strongly-enriched for monocyte-specific genes, and identified at least 134 additional candidate genes associated with IBD susceptibility from reanalysis of published GWA studies. We propose that dysregulation of monocyte adaptation to the environment of the gastrointestinal mucosa is the key process leading to inflammatory bowel disease.


Subject(s)
Inflammatory Bowel Diseases/genetics , Macrophages/cytology , Monocytes/cytology , Transcriptome , Amino Acid Motifs , Cell Differentiation , Cytokines/metabolism , Gene Expression Regulation , Genetic Predisposition to Disease , Genome-Wide Association Study , Genomics , Humans , Inflammation , Inflammatory Bowel Diseases/etiology , Intestinal Mucosa/metabolism , Ligands , Lipopolysaccharides/pharmacology , Macrophage Colony-Stimulating Factor/pharmacology , Multigene Family , Promoter Regions, Genetic , Time Factors , Transcription Factors/metabolism , Transcription, Genetic , Transcriptional Activation
13.
BMC Genomics ; 20(1): 718, 2019 Sep 18.
Article in English | MEDLINE | ID: mdl-31533632

ABSTRACT

BACKGROUND: The work of the FANTOM5 Consortium has brought forth a new level of understanding of the regulation of gene transcription and the cellular processes involved in creating diversity of cell types. In this study, we extended the analysis of the FANTOM5 Cap Analysis of Gene Expression (CAGE) transcriptome data to focus on understanding the genetic regulators involved in mouse cerebellar development. RESULTS: We used the HeliScopeCAGE library sequencing on cerebellar samples over 8 embryonic and 4 early postnatal times. This study showcases temporal expression pattern changes during cerebellar development. Through a bioinformatics analysis that focused on transcription factors, their promoters and binding sites, we identified genes that appear as strong candidates for involvement in cerebellar development. We selected several candidate transcriptional regulators for validation experiments including qRT-PCR and shRNA transcript knockdown. We observed marked and reproducible developmental defects in Atf4, Rfx3, and Scrt2 knockdown embryos, which support the role of these genes in cerebellar development. CONCLUSIONS: The successful identification of these novel gene regulators in cerebellar development demonstrates that the FANTOM5 cerebellum time series is a high-quality transcriptome database for functional investigation of gene regulatory networks in cerebellar development.


Subject(s)
Cerebellum/growth & development , Gene Expression Profiling , Nucleotide Motifs/genetics , Transcription, Genetic/genetics , Activating Transcription Factor 4/deficiency , Activating Transcription Factor 4/genetics , Activating Transcription Factor 4/metabolism , Animals , Cerebellum/embryology , Cerebellum/metabolism , Gene Expression Regulation, Developmental , Gene Knockdown Techniques , Mice , Mice, Inbred C57BL , Promoter Regions, Genetic/genetics , Regulatory Factor X Transcription Factors/deficiency , Regulatory Factor X Transcription Factors/genetics , Regulatory Factor X Transcription Factors/metabolism , Transcription Factors/deficiency , Transcription Factors/genetics , Transcription Factors/metabolism
14.
PLoS Comput Biol ; 14(3): e1005934, 2018 03.
Article in English | MEDLINE | ID: mdl-29494619

ABSTRACT

Genetic variants underlying complex traits, including disease susceptibility, are enriched within the transcriptional regulatory elements, promoters and enhancers. There is emerging evidence that regulatory elements associated with particular traits or diseases share similar patterns of transcriptional activity. Accordingly, shared transcriptional activity (coexpression) may help prioritise loci associated with a given trait, and help to identify underlying biological processes. Using cap analysis of gene expression (CAGE) profiles of promoter- and enhancer-derived RNAs across 1824 human samples, we have analysed coexpression of RNAs originating from trait-associated regulatory regions using a novel quantitative method (network density analysis; NDA). For most traits studied, phenotype-associated variants in regulatory regions were linked to tightly-coexpressed networks that are likely to share important functional characteristics. Coexpression provides a new signal, independent of phenotype association, to enable fine mapping of causative variants. The NDA coexpression approach identifies new genetic variants associated with specific traits, including an association between the regulation of the OCT1 cation transporter and genetic variants underlying circulating cholesterol levels. NDA strongly implicates particular cell types and tissues in disease pathogenesis. For example, distinct groupings of disease-associated regulatory regions implicate two distinct biological processes in the pathogenesis of ulcerative colitis; a further two separate processes are implicated in Crohn's disease. Thus, our functional analysis of genetic predisposition to disease defines new distinct disease endotypes. We predict that patients with a preponderance of susceptibility variants in each group are likely to respond differently to pharmacological therapy. Together, these findings enable a deeper biological understanding of the causal basis of complex traits.


Subject(s)
Genetic Predisposition to Disease/genetics , Genomics/methods , Promoter Regions, Genetic/genetics , Crohn Disease/genetics , Databases, Genetic , Gene Expression Profiling , Humans , Transcriptome/genetics
15.
BMC Genomics ; 19(1): 39, 2018 01 11.
Article in English | MEDLINE | ID: mdl-29325522

ABSTRACT

CORRECTION: The authors of the original article [1] would like to recognize the critical contribution of core members of the FANTOM5 Consortium, who played the critical role of HeliScopeCAGE sequencing experiments, quality control of tag reads and processing of the raw sequencing data.

16.
J Cell Sci ; 129(13): 2573-85, 2016 07 01.
Article in English | MEDLINE | ID: mdl-27199372

ABSTRACT

Lymphangiogenesis plays a crucial role during development, in cancer metastasis and in inflammation. Activation of VEGFR-3 (also known as FLT4) by VEGF-C is one of the main drivers of lymphangiogenesis, but the transcriptional events downstream of VEGFR-3 activation are largely unknown. Recently, we identified a wave of immediate early transcription factors that are upregulated in human lymphatic endothelial cells (LECs) within the first 30 to 80 min after VEGFR-3 activation. Expression of these transcription factors must be regulated by additional pre-existing transcription factors that are rapidly activated by VEGFR-3 signaling. Using transcription factor activity analysis, we identified the homeobox transcription factor HOXD10 to be specifically activated at early time points after VEGFR-3 stimulation, and to regulate expression of immediate early transcription factors, including NR4A1. Gain- and loss-of-function studies revealed that HOXD10 is involved in LECs migration and formation of cord-like structures. Furthermore, HOXD10 regulates expression of VE-cadherin, claudin-5 and NOS3 (also known as e-NOS), and promotes lymphatic endothelial permeability. Taken together, these results reveal an important and unanticipated role of HOXD10 in the regulation of VEGFR-3 signaling in lymphatic endothelial cells, and in the control of lymphangiogenesis and permeability.


Subject(s)
Homeodomain Proteins/genetics , Neoplasms/genetics , Nuclear Receptor Subfamily 4, Group A, Member 1/genetics , Transcription Factors/genetics , Vascular Endothelial Growth Factor C/genetics , Vascular Endothelial Growth Factor Receptor-3/genetics , Cell Line , Cell Membrane Permeability/genetics , Cell Movement/genetics , Endothelial Cells/metabolism , Endothelial Cells/pathology , Gene Expression Regulation, Neoplastic , Humans , Lymphangiogenesis/genetics , Neoplasm Metastasis , Neoplasms/pathology , Signal Transduction , Vascular Endothelial Growth Factor C/biosynthesis , Vascular Endothelial Growth Factor Receptor-3/biosynthesis
17.
Genome Res ; 25(10): 1546-57, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26228054

ABSTRACT

Promoters are central to the regulation of gene expression. Changes in gene regulation are thought to underlie much of the adaptive diversification between species and phenotypic variation within populations. In contrast to earlier work emphasizing the importance of enhancer evolution and subtle sequence changes at promoters, we show that dramatic changes such as the complete gain and loss (collectively, turnover) of functional promoters are common. Using quantitative measures of transcription initiation in both humans and mice across 52 matched tissues, we discriminate promoter sequence gains from losses and resolve the lineage of changes. We also identify expression divergence and functional turnover between orthologous promoters, finding only the latter is associated with local sequence changes. Promoter turnover has occurred at the majority (>56%) of protein-coding genes since humans and mice diverged. Tissue-restricted promoters are the most evolutionarily volatile where retrotransposition is an important, but not the sole, source of innovation. There is considerable heterogeneity of turnover rates between promoters in different tissues, but the consistency of these in both lineages suggests that the same biological systems are similarly inclined to transcriptional rewiring. The genes affected by promoter turnover show evidence of adaptive evolution. In mice, promoters are primarily lost through deletion of the promoter containing sequence, whereas in humans, many promoters appear to be gradually decaying with weak transcriptional output and relaxed selective constraint. Our results suggest that promoter gain and loss is an important process in the evolutionary rewiring of gene regulation and may be a significant source of phenotypic diversification.


Subject(s)
Evolution, Molecular , Promoter Regions, Genetic , Animals , Base Sequence , Conserved Sequence , DNA , DNA Transposable Elements , Humans , Mice , Mutagenesis, Insertional , Sequence Deletion , Species Specificity
18.
Cerebellum ; 17(3): 308-325, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29307116

ABSTRACT

Laser-capture microdissection was used to isolate external germinal layer tissue from three developmental periods of mouse cerebellar development: embryonic days 13, 15, and 18. The cerebellar granule cell-enriched mRNA library was generated with next-generation sequencing using the Helicos technology. Our objective was to discover transcriptional regulators that could be important for the development of cerebellar granule cells, the most numerous neuron in the central nervous system. Through differential expression analysis, we have identified 82 differentially expressed transcription factors (TFs) from a total of 1311 differentially expressed genes. In addition, with TF-binding sequence analysis, we have identified 46 TF candidates that could be key regulators responsible for the variation in the granule cell transcriptome between developmental stages. Altogether, we identified 125 potential TFs (82 from differential expression analysis and 46 from motif analysis, with 3 shared between the two sets). From this gene set, 37 TFs are considered novel due to the lack of previous knowledge about their roles in cerebellar development. The results from transcriptome-wide analyses were validated with existing online databases, qRT-PCR, and in situ hybridization. This study provides an initial insight into the TFs of cerebellar granule cells that might be important for development and provides valuable information for further functional studies on these transcriptional regulators.


Subject(s)
Cerebellum/embryology , Cerebellum/metabolism , Neurons/metabolism , Transcription Factors/metabolism , Animals , Computer Simulation , Gene Expression Profiling , Gene Expression Regulation, Developmental , In Situ Hybridization , Laser Capture Microdissection , Mice, Inbred C57BL , Real-Time Polymerase Chain Reaction , Transcriptome
19.
Nucleic Acids Res ; 44(7): 3233-52, 2016 Apr 20.
Article in English | MEDLINE | ID: mdl-27001520

ABSTRACT

Determining the functionality of the non-coding transcripts encoded by the human genome is a coveted goal of modern genomics research. While this is commonly approached with the classical methods of forward genetics, integration of different genomics datasets in a global Systems Biology fashion presents a more productive avenue for achieving this very complex aim. Here we report the application of a Systems Biology-based approach to dissect the functionality of a newly identified vast class of very long intergenic non-coding (vlinc) RNAs. Using the highly quantitative FANTOM5 CAGE dataset, we show that these RNAs could be grouped into 1542 novel human genes based on analysis of insulators that we show here indeed function as genomic barrier elements. We show that vlinc RNA genes likely function in cis to activate nearby genes. This effect, while most pronounced in closely spaced vlinc RNA-gene pairs, can be detected over relatively large genomic distances. Furthermore, we identified 101 vlinc RNA genes likely involved in early embryogenesis based on patterns of their expression and regulation. We also found another 109 such genes potentially involved in cellular functions that also occur at early stages of development, such as proliferation, migration and apoptosis. Overall, we show that Systems Biology-based methods have great promise for functional annotation of non-coding RNAs.


Subject(s)
RNA, Long Noncoding/genetics , Cell Nucleus/genetics , Embryonic Development/genetics , Gene Expression Regulation , Humans , Insulator Elements , Molecular Sequence Annotation , Promoter Regions, Genetic , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , Retroviridae/genetics , Systems Biology , Terminal Repeat Sequences , Transcription Factors/metabolism
20.
Genome Res ; 24(4): 708-17, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24676093

ABSTRACT

CAGE (cap analysis gene expression) and RNA-seq are two major technologies used to identify transcript abundances as well as structures. They measure expression by sequencing from either the 5' end of capped molecules (CAGE) or tags randomly distributed along the length of a transcript (RNA-seq). Library protocols for clonally amplified, second-generation sequencing platforms (Illumina, SOLiD, 454 Life Sciences [Roche], Ion Torrent) typically employ PCR preamplification prior to clonal amplification, while third-generation, single-molecule sequencers can sequence unamplified libraries. Although these transcriptome profiling platforms have been demonstrated to be individually reproducible, no systematic comparison has been carried out between them. Here we compare CAGE, using both second- and third-generation sequencers, and RNA-seq, using a second-generation sequencer, based on a panel of RNA mixtures from two human cell lines to examine power in the discrimination of biological states, detection of differentially expressed genes, linearity of measurements, and quantification reproducibility. We found that the quantified levels of gene expression are largely comparable across platforms and conclude that CAGE and RNA-seq are complementary technologies that can be used to improve incomplete gene models. We also found systematic bias in the second- and third-generation platforms, which is likely due to steps such as linker ligation, cleavage by restriction enzymes, and PCR amplification. This study provides a perspective on the performance of these platforms, which will be a baseline in the design of further experiments to tackle complex transcriptomes uncovered in a wide range of cell types.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , RNA/genetics , Transcriptome/genetics , Gene Expression Profiling , Humans , Sequence Analysis, RNA/methods