Results 1 - 20 of 35
1.
PLoS One ; 19(5): e0295971, 2024.
Article in English | MEDLINE | ID: mdl-38709794

ABSTRACT

The human genome is pervasively transcribed and produces a wide variety of long non-coding RNAs (lncRNAs), constituting the majority of transcripts across human cell types. Some specific nuclear lncRNAs have been shown to be important regulatory components acting locally. As RNA-chromatin interaction and Hi-C chromatin conformation data showed that chromatin interactions of nuclear lncRNAs are determined by the local chromatin 3D conformation, we used Hi-C data to identify potential target genes of lncRNAs. RNA-protein interaction data suggested that nuclear lncRNAs act as scaffolds to recruit regulatory proteins to target promoters and enhancers. Nuclear lncRNAs may therefore play a role in directing regulatory factors to locations spatially close to the lncRNA gene. We provide the analysis results through an interactive visualization web portal at https://fantom.gsc.riken.jp/zenbu/reports/#F6_3D_lncRNA.


Subject(s)
Chromatin , RNA, Long Noncoding , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Chromatin/metabolism , Chromatin/genetics , Humans , Molecular Sequence Annotation , Cell Nucleus/metabolism , Cell Nucleus/genetics , Genome, Human , Promoter Regions, Genetic
2.
Geroscience ; 46(2): 2063-2081, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37817005

ABSTRACT

While some older adults stay healthy and non-frail until late in life, others experience multimorbidity and frailty, often accompanied by a pro-inflammatory state. The underlying molecular mechanisms for those differences are still obscure. Here, we used gene expression analysis to understand the molecular differences between non-frail and frail individuals in old age. Twenty-four adults (50% non-frail and 50% frail) from the InCHIANTI study were included. Total RNA extracted from whole blood was analyzed by Cap Analysis of Gene Expression (CAGE). CAGE identified transcription start site (TSS) and active enhancer regions. We identified a set of differentially expressed (DE) TSSs and enhancers between non-frail and frail and between male and female participants. Several DE TSSs were annotated as lncRNAs (XIST and TTTY14) and antisense RNAs (ZFX-AS1 and OVCH1 Antisense RNA 1). The promoter region chr6:36,678,654-36,678,797;+ was DE and overlapped the longevity gene CDKN1A. GWAS-LD enrichment analysis identified LD blocks overlapping the DE regions with traits reported in the GWAS catalog (isovolumetric relaxation time and urinary tract infection frequency). Furthermore, we used weighted gene co-expression network analysis (WGCNA) to identify changes in gene expression associated with clinical traits and to identify key gene modules. We performed functional enrichment analysis of the gene modules with significant trait/module correlation. One gene module showed a very distinct pattern in hub genes. Glycogen Phosphorylase L (PYGL) was the top-ranked hub gene between non-frail and frail. We predicted transcription factor binding sites (TFBS) and motif activity. TFs involved in age-related pathways (e.g., FOXO3 and MYC) showed different expression patterns between non-frail and frail participants. Expanding the study of OVCH1 Antisense RNA 1 and PYGL may help understand the mechanisms leading to loss of homeostasis that ultimately causes frailty.


Subject(s)
Frailty , RNA, Long Noncoding , Humans , Male , Female , Aged , Frail Elderly , Frailty/genetics , Gene Expression Profiling , RNA, Long Noncoding/genetics , RNA, Antisense/genetics
3.
STAR Protoc ; 4(1): 102038, 2023 03 17.
Article in English | MEDLINE | ID: mdl-36853658

ABSTRACT

SkewC is a single-cell RNA sequencing (scRNA-seq) data quality evaluation tool. The approach is based on determining gene body coverage, and its skewness, as a quality metric for each individual cell. SkewC distinguishes between two types of single cells: typical cells with prototypical gene body coverage profiles and skewed cells with skewed gene body coverage profiles. SkewC can be used on any scRNA-seq data as it is independent from the underlying technology used to generate the data. For complete details on the use and execution of this protocol, please refer to Abugessaisa et al. (2022).


Subject(s)
Data Accuracy , Gene Expression Profiling , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis , Single-Cell Analysis/methods
4.
Cell Rep ; 41(13): 111893, 2022 12 27.
Article in English | MEDLINE | ID: mdl-36577377

ABSTRACT

Within the scope of the FANTOM6 consortium, we perform a large-scale knockdown of 200 long non-coding RNAs (lncRNAs) in human induced pluripotent stem cells (iPSCs) and systematically characterize their roles in self-renewal and pluripotency. We find 36 lncRNAs (18%) exhibiting cell growth inhibition. From the knockdown of 123 lncRNAs with transcriptome profiling, 36 lncRNAs (29.3%) show molecular phenotypes. Integrating the molecular phenotypes with chromatin-interaction assays further reveals cis- and trans-interacting partners as potential primary targets. Additionally, cell-type enrichment analysis identifies lncRNAs associated with pluripotency, while the knockdown of LINC02595, CATG00000090305.1, and RP11-148B6.2 modulates colony formation of iPSCs. We compare our results with previously published fibroblasts phenotyping data and find that 2.9% of the lncRNAs exhibit a consistent cell growth phenotype, whereas we observe 58.3% agreement in molecular phenotypes. This highlights that molecular phenotyping is more comprehensive in revealing affected pathways.


Subject(s)
Induced Pluripotent Stem Cells , RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Induced Pluripotent Stem Cells/metabolism , Oligonucleotides, Antisense , Gene Expression Profiling/methods , Embryonic Stem Cells/metabolism
5.
iScience ; 25(2): 103777, 2022 Feb 18.
Article in English | MEDLINE | ID: mdl-35146392

ABSTRACT

The analysis and interpretation of single-cell RNA sequencing (scRNA-seq) experiments are compromised by the presence of poor-quality cells. For meaningful analyses, such poor-quality cells should be excluded as they introduce noise in the data. We introduce SkewC, a quality-assessment tool, to identify skewed cells in scRNA-seq experiments. The tool's methodology is based on the assessment of gene coverage for each cell, and its skewness as a quality measure; the gene body coverage is a unique characteristic for each protocol, and different protocols yield highly different coverage profiles. This tool is designed to avoid misclustering or false clusters by identifying, isolating, and removing cells with skewed gene body coverage profiles. SkewC is capable of processing any type of scRNA-seq dataset, regardless of the protocol. We envision SkewC as a distinctive QC method to be incorporated into scRNA-seq QC processing to preclude the possibility of scRNA-seq data misinterpretation.
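
The idea of scoring each cell by the skewness of its gene body coverage can be illustrated with a minimal sketch. This is not SkewC's implementation: the function name `coverage_skewness`, the binning of the gene body into equal-width bins, and the use of a coverage-weighted third standardized moment are all assumptions made for illustration.

```python
def coverage_skewness(coverage):
    """Coverage-weighted skewness of relative position along the gene body.

    `coverage` is a hypothetical per-cell profile: one non-negative value per
    equal-width bin from the 5' end (first bin) to the 3' end (last bin).
    Returns about 0 for a symmetric profile, a negative value when reads pile
    up toward the 3' end, and a positive value when they pile up toward the
    5' end.
    """
    total = sum(coverage)
    if total <= 0:
        raise ValueError("coverage profile must have a positive total")
    n = len(coverage)
    # Bin centres expressed as a fraction of gene length (0 = 5' end, 1 = 3' end).
    positions = [(i + 0.5) / n for i in range(n)]
    mean = sum(p * c for p, c in zip(positions, coverage)) / total
    var = sum(c * (p - mean) ** 2 for p, c in zip(positions, coverage)) / total
    if var == 0:
        return 0.0
    third = sum(c * (p - mean) ** 3 for p, c in zip(positions, coverage)) / total
    return third / var ** 1.5


# Illustrative profiles (made-up values, not data from the paper).
uniform_profile = [1.0] * 100
three_prime_biased = [i + 1 for i in range(100)]
```

In a sketch like this, a cell whose score departs strongly from the scores typical for its protocol would be a candidate "skewed" cell; how SkewC actually sets that boundary is not stated in the abstract.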

6.
Stem Cell Reports ; 17(2): 289-306, 2022 02 08.
Article in English | MEDLINE | ID: mdl-35030321

ABSTRACT

Regenerative medicine relies on basic research outcomes that are only practical when cost effective. The human eyeball requires the retinal pigment epithelium (RPE) to interface the neural retina and the choroid at large. Millions of people suffer from age-related macular degeneration (AMD), a blinding multifactor genetic disease among RPE degradation pathologies. Recently, autologous pluripotent stem-cell-derived RPE cells were prohibitively expensive due to time; therefore, we developed a faster reprogramming system. We stably induced RPE-like cells (iRPE) from human fibroblasts (Fibs) by conditional overexpression of both broad plasticity and lineage-specific transcription factors (TFs). iRPE cells displayed critical RPE benchmarks and significant in vivo integration in transplanted retinas. Herein, we detail the iRPE system with comprehensive single-cell RNA sequencing (scRNA-seq) profiling to interpret and characterize its best cells. We anticipate that our system may enable robust retinal cell induction for basic research and affordable autologous human RPE tissue for regenerative cell therapy.


Subject(s)
Cellular Reprogramming , Fibroblasts/metabolism , Retinal Pigment Epithelium/metabolism , Animals , Cellular Reprogramming/drug effects , Disulfides/pharmacology , Fibroblasts/cytology , Gene Expression Regulation , Humans , Indole Alkaloids/pharmacology , Machine Learning , Niacinamide/pharmacology , Rats , Retina/cytology , Retina/metabolism , Retina/pathology , Retinal Pigment Epithelium/cytology , Retinal Pigment Epithelium/transplantation , Transcription Factors/genetics , Transcription Factors/metabolism
7.
BMC Genom Data ; 22(1): 33, 2021 09 14.
Article in English | MEDLINE | ID: mdl-34521352

ABSTRACT

BACKGROUND: The lymphatic and the blood vasculature are closely related systems that collaborate to ensure the organism's physiological function. Despite their common developmental origin, they present distinct functional fates in adulthood that rely on robust lineage-specific regulatory programs. The recent technological boost in sequencing approaches unveiled long noncoding RNAs (lncRNAs) as prominent regulatory players of various gene expression levels in a cell-type-specific manner. RESULTS: To investigate the potential roles of lncRNAs in vascular biology, we performed antisense oligonucleotide (ASO) knockdowns of lncRNA candidates specifically expressed either in human lymphatic or blood vascular endothelial cells (LECs or BECs) followed by Cap Analysis of Gene Expression (CAGE-Seq). Here, we describe the quality control steps adopted in our analysis pipeline before determining the knockdown effects of three ASOs per lncRNA target on the LEC or BEC transcriptomes. In this regard, we especially observed that the choice of negative control ASOs can dramatically impact the conclusions drawn from the analysis depending on the cellular background. CONCLUSION: In conclusion, the comparison of negative control ASO effects on the targeted cell type transcriptomes highlights the essential need to select a proper control set of multiple negative control ASO based on the investigated cell types.


Subject(s)
Gene Knockdown Techniques/methods , Oligonucleotides, Antisense/genetics , Organ Specificity/genetics , RNA, Long Noncoding/genetics , Adult , Endothelial Cells/metabolism , Gene Knockdown Techniques/standards , Humans , Lymphatic System/cytology , Lymphatic System/metabolism , Oligonucleotides, Antisense/standards , Transcriptome
8.
Geroscience ; 43(3): 1317-1329, 2021 06.
Article in English | MEDLINE | ID: mdl-33599920

ABSTRACT

Phenotype-specific omic expression patterns in people with frailty could provide invaluable insight into the underlying multi-systemic pathological processes and targets for intervention. Classical approaches to frailty have not considered the potential for different frailty phenotypes. We characterized associations between frailty (with/without disability) and sets of omic factors (genomic, proteomic, and metabolomic) plus markers measured in routine geriatric care. This study was a prevalent case control using stored biospecimens (urine, whole blood, cells, plasma, and serum) from 1522 individuals [identified as robust (R), pre-frail (P), or frail (F)] from the Toledo Study of Healthy Aging (R=178/P=184/F=109), 3 City Bordeaux (111/269/100), Aging Multidisciplinary Investigation (157/79/54) and InCHIANTI (106/98/77) cohorts. The analysis included over 35,000 omic and routine laboratory variables from robust and frail or pre-frail (with/without disability) individuals using a machine learning framework. We identified three protective biomarkers, vitamin D3 (OR: 0.81 [95% CI: 0.68-0.98]), lutein zeaxanthin (OR: 0.82 [95% CI: 0.70-0.97]), and miRNA125b-5p (OR: 0.73 [95% CI: 0.56-0.97]), and one risk biomarker, cardiac troponin T (OR: 1.25 [95% CI: 1.23-1.27]). Excluding individuals with a disability, one protective biomarker was identified, miR125b-5p (OR: 0.85 [95% CI: 0.81-0.88]). Three risk biomarkers for frailty were detected: pro-BNP (OR: 1.47 [95% CI: 1.27-1.7]), cardiac troponin T (OR: 1.29 [95% CI: 1.21-1.38]), and sRAGE (OR: 1.26 [95% CI: 1.01-1.57]). Three key frailty biomarkers demonstrated a statistical association with frailty (oxidative stress, vitamin D, and cardiovascular system) with relationship patterns differing depending on the presence or absence of a disability.
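
The protective/risk labels above follow the usual reading of an odds ratio with its 95% confidence interval: an interval entirely below 1 suggests a protective association, and one entirely above 1 suggests a risk association. The sketch below applies that rule to the intervals quoted in the abstract. The function name `classify_odds_ratio` is hypothetical, and the rule is the conventional interpretation, not necessarily the selection procedure the study's machine learning framework used.

```python
def classify_odds_ratio(ci_low, ci_high):
    """Label an association from the bounds of its 95% confidence interval."""
    if ci_high < 1.0:
        return "protective"    # whole interval below 1
    if ci_low > 1.0:
        return "risk"          # whole interval above 1
    return "inconclusive"      # interval contains 1


# 95% CI bounds as quoted in the abstract (all participants).
reported_intervals = {
    "vitamin D3": (0.68, 0.98),
    "lutein zeaxanthin": (0.70, 0.97),
    "miRNA125b-5p": (0.56, 0.97),
    "cardiac troponin T": (1.23, 1.27),
}

labels = {name: classify_odds_ratio(*ci) for name, ci in reported_intervals.items()}
```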


Subject(s)
Frailty , Aged , Case-Control Studies , Frail Elderly , Frailty/diagnosis , Humans , Machine Learning , Proteomics
9.
Nucleic Acids Res ; 49(D1): D892-D898, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33211864

ABSTRACT

The Functional ANnoTation Of the Mammalian genome (FANTOM) Consortium has continued to provide extensive resources in the pursuit of understanding the transcriptome, and transcriptional regulation, of mammalian genomes for the last 20 years. To share these resources with the research community, the FANTOM web-interfaces and databases are being regularly updated, enhanced and expanded with new data types. In recent years, the FANTOM Consortium's efforts have been mainly focused on creating new non-coding RNA datasets and resources. The existing FANTOM5 human and mouse miRNA atlas was supplemented with rat, dog, and chicken datasets. The sixth (latest) edition of the FANTOM project was launched to assess the function of human long non-coding RNAs (lncRNAs). From its creation until 2020, FANTOM6 has contributed to the research community a large dataset generated from the knock-down of 285 lncRNAs in human dermal fibroblasts; this is followed with extensive expression profiling and cellular phenotyping. Other updates to the FANTOM resource include the reprocessing of the miRNA and promoter atlases of human, mouse and chicken with the latest reference genome assemblies. To facilitate the use and accessibility of all above resources we further enhanced FANTOM data viewers and web interfaces. The updated FANTOM web resource is publicly available at https://fantom.gsc.riken.jp/.


Subject(s)
Molecular Sequence Annotation , RNA, Long Noncoding/genetics , Transcriptome/genetics , Animals , Binding Sites , Chromatin/metabolism , Drosophila/genetics , Fibroblasts/cytology , Fibroblasts/metabolism , Genome , Humans , Metadata , Mice , MicroRNAs/genetics , MicroRNAs/metabolism , Promoter Regions, Genetic , RNA, Long Noncoding/metabolism , Transcription Factors/metabolism , User-Computer Interface
10.
Genome Res ; 30(7): 951-961, 2020 07.
Article in English | MEDLINE | ID: mdl-32718981

ABSTRACT

Gene expression profiles in homologous tissues have been observed to be different between species, which may be due to differences between species in the gene expression program in each cell type, but may also reflect differences in cell type composition of each tissue in different species. Here, we compare expression profiles in matching primary cells in human, mouse, rat, dog, and chicken using Cap Analysis Gene Expression (CAGE) and short RNA (sRNA) sequencing data from FANTOM5. While we find that expression profiles of orthologous genes in different species are highly correlated across cell types, in each cell type many genes were differentially expressed between species. Expression of genes with products involved in transcription, RNA processing, and transcriptional regulation was more likely to be conserved, while expression of genes encoding proteins involved in intercellular communication was more likely to have diverged during evolution. Conservation of expression correlated positively with the evolutionary age of genes, suggesting that divergence in expression levels of genes critical for cell function was restricted during evolution. Motif activity analysis showed that both promoters and enhancers are activated by the same transcription factors in different species. An analysis of expression levels of mature miRNAs and of primary miRNAs identified by CAGE revealed that evolutionarily old miRNAs are more likely to have conserved expression patterns than young miRNAs. We conclude that key aspects of the regulatory network are conserved, while differential expression of genes involved in cell-to-cell communication may contribute greatly to phenotypic differences between species.


Subject(s)
Evolution, Molecular , Transcriptome , Animals , Chickens/genetics , Dogs , Gene Expression Profiling , Gene Regulatory Networks , Humans , Mice , MicroRNAs/metabolism , Nucleotide Motifs , Principal Component Analysis , Promoter Regions, Genetic , Rats , Species Specificity , Transcription Factors/metabolism
11.
Genome Res ; 30(7): 1060-1072, 2020 07.
Article in English | MEDLINE | ID: mdl-32718982

ABSTRACT

Long noncoding RNAs (lncRNAs) constitute the majority of transcripts in the mammalian genomes, and yet, their functions remain largely unknown. As part of the FANTOM6 project, we systematically knocked down the expression of 285 lncRNAs in human dermal fibroblasts and quantified cellular growth, morphological changes, and transcriptomic responses using Capped Analysis of Gene Expression (CAGE). Antisense oligonucleotides targeting the same lncRNAs exhibited global concordance, and the molecular phenotype, measured by CAGE, recapitulated the observed cellular phenotypes while providing additional insights on the affected genes and pathways. Here, we disseminate the largest-to-date lncRNA knockdown data set with molecular phenotyping (over 1000 CAGE deep-sequencing libraries) for further exploration and highlight functional roles for ZNF213-AS1 and lnc-KHDC3L-2.


Subject(s)
RNA, Long Noncoding/physiology , Cell Growth Processes/genetics , Cell Movement/genetics , Fibroblasts/cytology , Fibroblasts/metabolism , Humans , KCNQ Potassium Channels/metabolism , Molecular Sequence Annotation , Oligonucleotides, Antisense , RNA, Long Noncoding/antagonists & inhibitors , RNA, Long Noncoding/metabolism , RNA, Small Interfering
12.
J Mol Biol ; 431(13): 2407-2422, 2019 06 14.
Article in English | MEDLINE | ID: mdl-31075273

ABSTRACT

Transcription starts at genomic positions called transcription start sites (TSSs), producing RNAs, and is mainly regulated by genomic elements and transcription factors binding around these TSSs. This indicates that TSSs may be a better unit to integrate various data sources related to transcriptional events, including regulation and production of RNAs. However, although several TSS datasets and promoter atlases are available, a comprehensive reference set that integrates all known TSSs is lacking. Thus, we constructed a reference dataset of TSSs (refTSS) for the human and mouse genomes by collecting publicly available TSS annotations and promoter resources, such as FANTOM5, DBTSS, EPDnew, and ENCODE. The data set consists of genomic coordinates of TSS peaks, their gene annotations, quality check results, and conservation between human and mouse. We also developed a web interface to browse the refTSS (http://reftss.clst.riken.jp/). Users can access the resource for collecting and integrating data and information about transcriptional regulation and transcription products.
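
Building a single reference set from several TSS resources implies some way of collapsing peaks that different sources report at the same location. A minimal sketch of one such step is shown below, assuming peaks are closed intervals on a common genome assembly and that overlapping peaks on the same chromosome and strand are merged. The function name `merge_tss_peaks`, the tuple layout, and the coordinates are hypothetical; the abstract does not describe how refTSS actually combines its sources.

```python
def merge_tss_peaks(peaks):
    """Merge overlapping TSS peaks reported by different sources.

    `peaks` is an iterable of (chrom, strand, start, end, source) tuples.
    Peaks on the same chromosome and strand whose intervals overlap or touch
    are collapsed into one record that remembers every contributing source.
    """
    merged = []
    # Sorting groups peaks by chromosome and strand, then orders them by start.
    for chrom, strand, start, end, source in sorted(peaks):
        last = merged[-1] if merged else None
        if (last is not None and last["chrom"] == chrom
                and last["strand"] == strand and start <= last["end"]):
            last["end"] = max(last["end"], end)
            last["sources"].add(source)
        else:
            merged.append({"chrom": chrom, "strand": strand, "start": start,
                           "end": end, "sources": {source}})
    return merged


# Made-up coordinates; only the source names come from the abstract.
example_peaks = [
    ("chr1", "+", 100, 120, "FANTOM5"),
    ("chr1", "+", 115, 130, "EPDnew"),
    ("chr1", "-", 118, 125, "DBTSS"),
    ("chr2", "+", 50, 60, "ENCODE"),
]
reference_peaks = merge_tss_peaks(example_peaks)
```

Keeping the set of contributing sources on each merged peak is one simple way to support the kind of per-peak quality information the abstract mentions.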


Subject(s)
Databases, Genetic , Sequence Analysis, DNA/methods , Transcription Initiation Site , Animals , Atlases as Topic , Conserved Sequence , Gene Expression Regulation , Humans , Mice , Molecular Sequence Annotation , Promoter Regions, Genetic
13.
Nat Commun ; 10(1): 360, 2019 01 21.
Article in English | MEDLINE | ID: mdl-30664627

ABSTRACT

Single-cell transcriptomic profiling is a powerful tool to explore cellular heterogeneity. However, most of these methods focus on the 3'-end of polyadenylated transcripts and provide only a partial view of the transcriptome. We introduce C1 CAGE, a method for the detection of transcript 5'-ends with an original sample multiplexing strategy in the C1™ microfluidic system. We first quantify the performance of C1 CAGE and find it as accurate and sensitive as other methods in the C1 system. We then use it to profile promoter and enhancer activities in the cellular response to TGF-β of lung cancer cells and discover subpopulations of cells differing in their response. We also describe enhancer RNA dynamics revealing transcriptional bursts in subsets of cells with transcripts arising from either strand in a mutually exclusive manner, validated using single molecule fluorescence in situ hybridization.


Subject(s)
Enhancer Elements, Genetic , Fibroblasts/metabolism , RNA, Messenger/genetics , Single-Cell Analysis/methods , Transcription Initiation Site , Transcriptome , A549 Cells , Animals , Cell Line , Fibroblasts/cytology , Fibroblasts/drug effects , Gene Expression Profiling , Humans , In Situ Hybridization, Fluorescence , Mice , Microfluidic Analytical Techniques , Promoter Regions, Genetic , RNA, Messenger/metabolism , Sequence Analysis, RNA , Single-Cell Analysis/instrumentation , Transforming Growth Factor beta/pharmacology
14.
Semin Arthritis Rheum ; 48(6): 967-975, 2019 06.
Article in English | MEDLINE | ID: mdl-30420245

ABSTRACT

OBJECTIVES: To evaluate the incidence of anti-drug antibody (ADA) occurrences and ADA-related risk factors under adalimumab and infliximab treatment in rheumatoid arthritis (RA) patients. METHODS: The study combined retrospective cohorts from the ABIRISK project totaling 366 RA patients treated with adalimumab (n = 240) or infliximab (n = 126), 92.4% of them anti-TNF naive (n = 328/355) and 96.6% of them co-treated with methotrexate (n = 341/353) with up to 18 months follow-up. ADA positivity was measured by enzyme-linked immunosorbent assay. The cumulative incidence of ADA was estimated, and potential bio-clinical factors were investigated using a Cox regression model on interval-censored data. RESULTS: ADAs were detected within 18 months in 19.2% (n = 46) of the adalimumab-treated patients and 29.4% (n = 37) of the infliximab-treated patients. The cumulative incidence of ADA increased over time. In the adalimumab and infliximab groups, respectively, the incidence was 15.4% (5.2-20.2) and 0% (0-5.9) at 3 months, 17.6% (11.4-26.4) and 0% (0-25.9) at 6 months, 17.7% (12.6-37.5) and 34.1% (11.4-46.3) at 12 months, 50.0% (25.9-87.5) and 37.5% (25.9-77.4) at 15 months and 50.0% (25.9-87.5) and 66.7% (37.7-100) at 18 months. Factors associated with a higher risk of ADA development were: longer disease duration (1-3 vs. < 1 year; adalimumab: HR 3.0, 95% CI 1.0-8.7; infliximab: HR 2.7, 95% CI 1.1-6.8), moderate disease activity (DAS28 3.2-5.1 vs. < 3.2; adalimumab: HR 6.6, 95% CI 1.3-33.7) and lifetime smoking (infliximab: HR 2.7, 95% CI 1.2-6.3). CONCLUSIONS: The current study focusing on patients co-treated with methotrexate for more than 95% of them found a late occurrence of ADAs not previously observed, whereby the risk continued to increase over 18 months. Disease duration, DAS28 and lifetime smoking are clinical predictors of ADA development.
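
The percentages in this abstract can be reproduced from the counts given alongside them. The short check below does only that arithmetic; the helper name `percent` is hypothetical, and the counts are copied from the abstract.

```python
def percent(numerator, denominator):
    """Proportion expressed as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# (count, group size) pairs as reported in the abstract.
reported_counts = {
    "ADA-positive, adalimumab": (46, 240),
    "ADA-positive, infliximab": (37, 126),
    "anti-TNF naive": (328, 355),
    "co-treated with methotrexate": (341, 353),
}

recomputed = {label: percent(n, d) for label, (n, d) in reported_counts.items()}
```

The denominators for the last two rows (355 and 353) are smaller than the 366 enrolled patients, presumably because of missing values; the abstract does not say so explicitly.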


Subject(s)
Adalimumab/immunology , Antibodies , Antirheumatic Agents/immunology , Arthritis, Rheumatoid/drug therapy , Infliximab/immunology , Adalimumab/therapeutic use , Adult , Aged , Antirheumatic Agents/therapeutic use , Arthritis, Rheumatoid/immunology , Drug Therapy, Combination , Female , Humans , Infliximab/therapeutic use , Male , Methotrexate/therapeutic use , Middle Aged , Retrospective Studies , Risk Factors
15.
Nucleic Acids Res ; 47(D1): D752-D758, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407557

ABSTRACT

The FANTOM web resource (http://fantom.gsc.riken.jp/) was developed to provide easy access to the data produced by the FANTOM project. It contains the most complete and comprehensive sets of actively transcribed enhancers and promoters in the human and mouse genomes. We determined the transcription activities of these regulatory elements by CAGE (Cap Analysis of Gene Expression) for both steady and dynamic cellular states in all major and some rare cell types, consecutive stages of differentiation and responses to stimuli. We have expanded the resource by employing different assays, such as RNA-seq, short RNA-seq and a paired-end protocol for CAGE (CAGEscan), to provide new angles to study the transcriptome. That yielded additional atlases of long noncoding RNAs, miRNAs and their promoters. We have also expanded the CAGE analysis to cover rat, dog, chicken, and macaque species for a limited number of cell types. The CAGE data obtained from human and mouse were reprocessed to make them available on the latest genome assemblies. Here, we report the recent updates of both data and interfaces in the FANTOM web resource.


Subject(s)
Databases, Genetic , Genome/genetics , Internet , Transcriptome/genetics , Animals , Cell Differentiation/genetics , Chickens/genetics , Dogs , Gene Expression Regulation/genetics , Genomics/trends , Humans , Mice , MicroRNAs/genetics , Promoter Regions, Genetic/genetics , RNA, Long Noncoding/genetics , Rats , User-Computer Interface
16.
Sci Data ; 5(1): 2, 2018 12 11.
Article in English | MEDLINE | ID: mdl-30538238

ABSTRACT

The authors regret that Luba M. Pardo was omitted in error from the author list of the original version of this Data Descriptor. This omission has now been corrected in the HTML and PDF versions. The authors also regret that Anemieke Rozemuller was omitted in error from the Acknowledgements of the original version of this Data Descriptor. This omission has now been corrected in the HTML and PDF versions.

17.
Nucleic Acids Res ; 46(D1): D781-D787, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29045713

ABSTRACT

Published single-cell datasets are rich resources for investigators who want to address questions not originally asked by the creators of the datasets. The single-cell datasets might be obtained by different protocols and diverse analysis strategies. The main challenge in utilizing such single-cell data is how to make the various large-scale datasets comparable and reusable in a different context. To address this issue, we developed the single-cell centric database 'SCPortalen' (http://single-cell.clst.riken.jp/). The current version of the database covers human and mouse single-cell transcriptomics datasets that are publicly available from the INSDC sites. The original metadata was manually curated and single-cell samples were annotated with standard ontology terms. Following that, common quality assessment procedures were conducted to check the quality of the raw sequence. Furthermore, primary data processing of the raw data followed by advanced analyses and interpretation have been performed from scratch using our pipeline. In addition to the transcriptomics data, SCPortalen provides access to single-cell image files whenever available. The target users of SCPortalen are all researchers interested in specific cell types or population heterogeneity. Through the web interface of SCPortalen, users can easily search, explore and download the single-cell datasets of interest.


Subject(s)
Databases, Genetic , Datasets as Topic , Mice/genetics , Single-Cell Analysis , Transcriptome , Animals , Data Accuracy , Data Curation , Gene Expression , Gene Ontology , Humans , Molecular Sequence Annotation , User-Computer Interface , Workflow
18.
Sci Data ; 4: 170173, 2017 11 28.
Article in English | MEDLINE | ID: mdl-29182598

ABSTRACT

The promoter landscape of several non-human model organisms is far from complete. As a part of FANTOM5 data collection, we generated 13 profiles of transcription initiation activities in dog and rat aortic smooth muscle cells, mesenchymal stem cells and hepatocytes by employing CAGE (Cap Analysis of Gene Expression) technology combined with single molecule sequencing. Our analyses show that the CAGE profiles recapitulate known transcription start sites (TSSs) consistently, in addition to uncovering novel TSSs. Our dataset can thus be used with high confidence to support gene annotation in dog and rat species. We identified 28,497 and 23,147 CAGE peaks, or promoter regions, for rat and dog respectively, and associated them with known genes. This approach could be seen as a standard method for improvement of existing gene models, as well as discovery of novel genes. Given that the FANTOM5 data collection includes dog and rat matched cell types in human and mouse as well, this data would also be useful for cross-species studies.


Subject(s)
Transcription, Genetic , Animals , Dogs , Molecular Sequence Annotation , Promoter Regions, Genetic , Rats , Transcription Initiation Site
19.
Sci Data ; 4: 170163, 2017 10 31.
Article in English | MEDLINE | ID: mdl-29087374

ABSTRACT

Rhesus macaque was the second non-human primate whose genome has been fully sequenced and is one of the most used model organisms to study human biology and disease, thanks to the close evolutionary relationship between the two species. But compared to human, where several previously unknown RNAs have been uncovered, the macaque transcriptome is less studied. Publicly available RNA expression resources for macaque are limited, even for brain, which is highly relevant to study human cognitive abilities. In an effort to complement those resources, FANTOM5 profiled 15 distinct anatomical regions of the aged macaque central nervous system using Cap Analysis of Gene Expression, a high-resolution, annotation-independent technology that allows monitoring of transcription initiation events with high accuracy. We identified 25,869 CAGE peaks, representing bona fide promoters. For each peak we provide detailed annotation, expanding the landscape of 'known' macaque genes, and we show concrete examples on how to use the resulting data. We believe this data represents a useful resource to understand the central nervous system in macaque.


Subject(s)
Central Nervous System , Macaca mulatta , Transcription Initiation Site , Animals , Central Nervous System/anatomy & histology , Transcriptome
20.
Sci Data ; 4: 170147, 2017 10 03.
Article in English | MEDLINE | ID: mdl-28972578

ABSTRACT

The FANTOM5 expression atlas is a quantitative measurement of the activity of nearly 200,000 promoter regions across nearly 2,000 different human primary cells, tissue types and cell lines. Generation of this atlas was made possible by the use of CAGE, an experimental approach to localise transcription start sites at single-nucleotide resolution by sequencing the 5' ends of capped RNAs after their conversion to cDNAs. While 50% of CAGE-defined promoter regions could be confidently associated to adjacent transcriptional units, nearly 100,000 promoter regions remained gene-orphan. To address this, we used the CAGEscan method, in which random-primed 5'-cDNAs are paired-end sequenced. Pairs starting in the same region are assembled in transcript models called CAGEscan clusters. Here, we present the production and quality control of CAGEscan libraries from 56 FANTOM5 RNA sources, which enhances the FANTOM5 expression atlas by providing experimental evidence associating core promoter regions with their cognate transcripts.
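
The grouping step described above, where read pairs starting in the same region are assembled into one transcript model, can be sketched as follows. This is an illustration only: the function name `group_pairs_by_promoter`, the plus-strand-only coordinates, and the representation of a cluster as a simple span are assumptions, not the published CAGEscan assembly procedure.

```python
from collections import defaultdict


def group_pairs_by_promoter(pairs, promoters):
    """Group paired-end reads by the promoter region containing their 5' start.

    `pairs` holds (five_prime_start, mate_end) positions on the plus strand of
    one chromosome; `promoters` maps a region name to its (start, end) bounds.
    Each resulting cluster is summarised as (cluster_start, cluster_end,
    n_pairs). Pairs that start outside every promoter region are ignored.
    """
    members = defaultdict(list)
    for five_prime, mate_end in pairs:
        for name, (start, end) in promoters.items():
            if start <= five_prime <= end:
                members[name].append((five_prime, mate_end))
                break
    return {
        name: (min(p[0] for p in group), max(p[1] for p in group), len(group))
        for name, group in members.items()
    }


# Made-up example: two promoter regions and four read pairs.
example_promoters = {"p1": (100, 120), "p2": (500, 520)}
example_pairs = [(105, 300), (110, 450), (505, 900), (1000, 1100)]
clusters = group_pairs_by_promoter(example_pairs, example_promoters)
```

A cluster that extends from an orphan promoter into an annotated transcript is the kind of evidence that could link the two, which is the use case the abstract describes.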


Subject(s)
Promoter Regions, Genetic , Transcription, Genetic , DNA, Complementary , Humans , Organ Specificity , Sequence Analysis, RNA , Transcription Initiation Site