1.
Elife ; 12, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38752723

ABSTRACT

A causal relationship exists among the aging process, organ decay and dysfunction, and the occurrence of various diseases including cancer. A genetically engineered mouse model, termed Klf1K74R/K74R or Klf1(K74R), carrying a mutation at the well-conserved sumoylation site of the hematopoietic transcription factor KLF1/EKLF has been generated that possesses extended lifespan and healthy characteristics, including cancer resistance. We show that the healthy longevity characteristics of the Klf1(K74R) mice, as exemplified by their higher anti-cancer capability, are likely gender-, age-, and genetic background-independent. Significantly, the anti-cancer capability, in particular that against melanoma as well as hepatocellular carcinoma, and lifespan-extending property of Klf1(K74R) mice, could be transferred to wild-type mice via transplantation of their bone marrow mononuclear cells at a young age of the latter. Furthermore, NK(K74R) cells carry higher in vitro cancer cell-killing ability than wild-type NK cells. Targeted/global gene expression profiling analysis has identified changes in the expression of specific proteins, including the immune checkpoint factors PDCD and CD274, and cellular pathways in the leukocytes of the Klf1(K74R) that are in the directions of anti-cancer and/or anti-aging. This study demonstrates the feasibility of developing a transferable hematopoietic/blood system for long-term anti-cancer and, potentially, for anti-aging.


Subject(s)
Kruppel-Like Transcription Factors , Longevity , Animals , Kruppel-Like Transcription Factors/genetics , Kruppel-Like Transcription Factors/metabolism , Mice , Longevity/genetics , Killer Cells, Natural/immunology , Neoplasms/genetics , Genetic Engineering , Bone Marrow Transplantation , Female , Gene Expression Profiling , Male , Mice, Transgenic
2.
Acta Neuropathol Commun ; 12(1): 77, 2024 May 18.
Article in English | MEDLINE | ID: mdl-38762464

ABSTRACT

Glioblastoma (GBM) is the most common malignant brain tumor in adults, which remains incurable and often recurs rapidly after initial therapy. While large efforts have been dedicated to uncover genomic/transcriptomic alterations associated with the recurrence of GBMs, the evolutionary trajectories of matched pairs of primary and recurrent (P-R) GBMs remain largely elusive. It remains challenging to identify genes associated with time to relapse (TTR) and construct a stable and effective prognostic model for predicting TTR of primary GBM patients. By integrating RNA-sequencing and genomic data from multiple datasets of patient-matched longitudinal GBMs of isocitrate dehydrogenase wild-type (IDH-wt), here we examined the associations of TTR with heterogeneities between paired P-R GBMs in gene expression profiles, tumor mutation burden (TMB), and microenvironment. Our results revealed a positive correlation between TTR and transcriptomic/genomic differences between paired P-R GBMs, higher percentages of non-mesenchymal-to-mesenchymal transition and mesenchymal subtype for patients with a short TTR than for those with a long TTR, a high correlation between paired P-R GBMs in gene expression profiles and TMB, and a negative correlation between the fitting level of such a paired P-R GBM correlation and TTR. According to these observations, we identified 55 TTR-associated genes and thereby constructed a seven-gene (ZSCAN10, SIGLEC14, GHRHR, TBX15, TAS2R1, CDKL1, and CD101) prognostic model for predicting TTR of primary IDH-wt GBM patients using univariate/multivariate Cox regression analyses. The risk scores estimated by the model were significantly negatively correlated with TTR in the training set and two independent testing sets.
The model also segregated IDH-wt GBM patients into two groups with significantly divergent progression-free survival outcomes and showed promising performance for predicting 1-, 2-, and 3-year progression-free survival rates in all training and testing sets. Our findings provide new insights into the molecular understanding of GBM progression at recurrence and potential targets for therapeutic treatments.
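
A minimal sketch of how a Cox-type risk score of the kind described above is usually computed: a weighted sum of gene expression values, with patients then split into two groups. The coefficients, expression values, patient IDs, and the median cutoff are all hypothetical placeholders; the abstract does not report the fitted coefficients or the cutoff rule.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Linear predictor of a Cox model: sum of coefficient * expression."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def split_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical coefficients for two of the seven model genes (not from the paper)
coefficients = {"TBX15": 0.5, "CDKL1": -0.2}

# Hypothetical normalized expression values for three patients
patients = {
    "P1": {"TBX15": 2.0, "CDKL1": 1.0},
    "P2": {"TBX15": 0.5, "CDKL1": 2.0},
    "P3": {"TBX15": 3.0, "CDKL1": 0.5},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

Under this sketch, a higher score would correspond to a shorter expected TTR, in line with the negative correlation the abstract reports.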


Subject(s)
Brain Neoplasms , Glioblastoma , Isocitrate Dehydrogenase , Neoplasm Recurrence, Local , Transcriptome , Humans , Glioblastoma/genetics , Glioblastoma/pathology , Isocitrate Dehydrogenase/genetics , Brain Neoplasms/genetics , Brain Neoplasms/pathology , Neoplasm Recurrence, Local/genetics , Male , Female , Genomics/methods , Mutation , Middle Aged , Time Factors
3.
Nucleic Acids Res ; 52(D1): D115-D123, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37823705

ABSTRACT

Circular RNAs (circRNAs) are RNA molecules with a continuous loop structure characterized by back-splice junctions (BSJs). While analyses of short-read RNA sequencing have identified millions of BSJ events, it is inherently challenging to determine exact full-length sequences and alternatively spliced (AS) isoforms of circRNAs. Recent advances in nanopore long-read sequencing with circRNA enrichment bring an unprecedented opportunity for investigating the issues. Here, we developed FL-circAS (https://cosbi.ee.ncku.edu.tw/FL-circAS/), which collected such long-read sequencing data of 20 cell lines/tissues and thereby identified 884 636 BSJs with 1 853 692 full-length circRNA isoforms in human and 115 173 BSJs with 135 617 full-length circRNA isoforms in mouse. FL-circAS also provides multiple circRNA features. For circRNA expression, FL-circAS calculates expression levels for each circRNA isoform, cell line/tissue specificity at both the BSJ and isoform levels, and AS entropy for each BSJ across samples. For circRNA biogenesis, FL-circAS identifies reverse complementary sequences and RNA binding protein (RBP) binding sites residing in flanking sequences of BSJs. For functional patterns, FL-circAS identifies potential microRNA/RBP binding sites and several types of evidence for circRNA translation on each full-length circRNA isoform. FL-circAS provides user-friendly interfaces for browsing, searching, analyzing, and downloading data, serving as the first resource for discovering full-length circRNAs at the isoform level.
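
The abstract does not define "AS entropy". One common reading, assumed here, is the Shannon entropy of isoform usage at a single BSJ: zero when one isoform accounts for all reads, larger when reads are spread across isoforms. The function name and read counts below are hypothetical and are not taken from FL-circAS.

```python
import math

def as_entropy(isoform_counts):
    """Shannon entropy (in bits) of isoform usage at one back-splice junction."""
    total = sum(isoform_counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in isoform_counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

# Hypothetical read counts for isoforms sharing one BSJ
print(as_entropy([10, 10]))    # two equally used isoforms
print(as_entropy([20, 0, 0]))  # a single isoform carries every read
```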


Subject(s)
Databases, Nucleic Acid , RNA, Circular , Animals , Humans , Mice , Alternative Splicing/genetics , MicroRNAs/genetics , MicroRNAs/metabolism , Nanopore Sequencing , RNA, Circular/genetics , RNA Isoforms/genetics
4.
Nat Methods ; 20(8): 1159-1169, 2023 08.
Article in English | MEDLINE | ID: mdl-37443337

ABSTRACT

The detection of circular RNA molecules (circRNAs) is typically based on short-read RNA sequencing data processed using computational tools. Numerous such tools have been developed, but a systematic comparison with orthogonal validation is missing. Here, we set up a circRNA detection tool benchmarking study, in which 16 tools detected more than 315,000 unique circRNAs in three deeply sequenced human cell types. Next, 1,516 predicted circRNAs were validated using three orthogonal methods. Generally, tool-specific precision is high and similar (median of 98.8%, 96.3% and 95.5% for qPCR, RNase R and amplicon sequencing, respectively) whereas the sensitivity and number of predicted circRNAs (ranging from 1,372 to 58,032) are the most significant differentiators. Of note, precision values are lower when evaluating low-abundance circRNAs. We also show that the tools can be used complementarily to increase detection sensitivity. Finally, we offer recommendations for future circRNA detection and validation.
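
A small sketch of the two metrics compared above, and of why combining tools can raise sensitivity. It assumes a set of validated circRNAs stands in for the truth; the tool names and circRNA identifiers are hypothetical, and the study's own definitions may differ in detail.

```python
def precision(predicted, validated):
    """Share of a tool's predictions that are in the validated set."""
    return len(predicted & validated) / len(predicted) if predicted else 0.0

def sensitivity(predicted, validated):
    """Share of the validated set that a tool recovers."""
    return len(predicted & validated) / len(validated) if validated else 0.0

# Hypothetical identifiers: c* passed validation, x1 did not
validated = {"c1", "c2", "c3", "c4"}
tool_a = {"c1", "c2", "x1"}
tool_b = {"c3", "c4"}
combined = tool_a | tool_b  # union of both tools' predictions

for name, calls in [("tool_a", tool_a), ("tool_b", tool_b), ("combined", combined)]:
    print(name, precision(calls, validated), sensitivity(calls, validated))
```

In this toy case each tool alone recovers half of the validated set, while the union recovers all of it at a modest cost in precision.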


Subject(s)
Benchmarking , RNA, Circular , Humans , RNA, Circular/genetics , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA/methods
5.
Nucleic Acids Res ; 51(15): 7777-7797, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37497782

ABSTRACT

Trans-spliced RNAs (ts-RNAs) are a type of non-co-linear (NCL) transcripts that consist of exons in an order topologically inconsistent with the corresponding DNA template. Detection of ts-RNAs is often confounded by experimental artifacts, circular RNAs (circRNAs) and genetic rearrangements. Particularly, intragenic ts-RNAs, which are derived from separate precursor mRNA molecules of the same gene, are often mistaken for circRNAs through analyses of RNA-seq data. Here we developed a bioinformatics pipeline (NCLscan-hybrid), which integrated short and long RNA-seq reads to minimize false positives and proposed out-of-circle and rolling-circle long reads to distinguish between intragenic ts-RNAs and circRNAs. Combining NCLscan-hybrid screening and multiple experimental validation steps successfully confirmed that four NCL events, which were previously regarded as circRNAs in databases, originated from trans-splicing. CRISPR-based endogenous genome modification experiments further showed that flanking intronic complementary sequences can significantly contribute to ts-RNA formation, providing an efficient/specific method to deplete ts-RNAs. We also experimentally validated that one ts-RNA (ts-ARFGEF1) played an important role in p53-mediated apoptosis through affecting the PERK/eIF2a/ATF4/CHOP signaling pathway in breast cancer cells. This study thus described both bioinformatics procedures and experimental validation steps for rigorous characterization of ts-RNAs, expanding future studies for identification, biogenesis, and function of these important but understudied transcripts.


Subject(s)
Sequence Analysis, RNA , Trans-Splicing , Genome , RNA Splicing , RNA, Circular , Sequence Analysis, RNA/methods
6.
Life Sci Alliance ; 6(5), 2023 05.
Article in English | MEDLINE | ID: mdl-36849251

ABSTRACT

Circular RNAs (circRNAs) are non-polyadenylated RNAs with a continuous loop structure characterized by a non-colinear back-splice junction (BSJ). Although millions of circRNA candidates have been identified, determining circRNA reliability remains a major challenge because of various types of false positives. Here, we systematically assess the impacts of numerous factors related to circRNA identification, conservation, biogenesis, and function on circRNA reliability by comparisons of circRNA expression from mock and the corresponding colinear/polyadenylated RNA-depleted datasets based on three different RNA treatment approaches. Eight important indicators of circRNA reliability are determined. Analyses of the relative contribution to variability explained reveal that the relative importance of these factors in affecting circRNA reliability in descending order is the conservation level of circRNA, full-length circular sequences, supporting BSJ read count, both BSJ donor and acceptor splice sites at the same colinear transcript isoforms, both BSJ donor and acceptor splice sites at the annotated exon boundaries, BSJs detected by multiple tools, supporting functional features, and both BSJ donor and acceptor splice sites undergoing alternative splicing. This study thus provides a useful guideline and an important resource for selecting high-confidence circRNAs for further investigations.


Subject(s)
RNA, Circular , RNA , RNA, Circular/genetics , Reproducibility of Results , RNA/genetics , Alternative Splicing/genetics , Exons/genetics
7.
Mol Psychiatry ; 27(11): 4695-4706, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35962193

ABSTRACT

Genetic risk variants and transcriptional expression changes in autism spectrum disorder (ASD) were widely investigated, but their causal relationship remains largely unknown. Circular RNAs (circRNAs) are abundant in brain and often serve as upstream regulators of mRNAs. By integrating RNA-sequencing with genotype data from autistic brains, we assessed expression quantitative trait loci of circRNAs (circQTLs) that cis-regulated expression of nearby circRNAs and trans-regulated expression of distant genes (trans-eGenes) simultaneously. We thus identified 3619 circQTLs that were also trans-eQTLs and constructed 19,804 circQTL-circRNA-trans-eGene regulatory axes. We conducted two different types of approaches, mediation and partial correlation tests (MPT), to determine the axes with mediation effects of circQTLs on trans-eGene expression through circRNA expression. We showed that the mediation effects of the circQTLs (trans-eQTLs) on circRNA expression were positively correlated with the magnitude of circRNA-trans-eGene correlation of expression profile. The positive correlation became more significant after adjustment for the circQTLs. Of the 19,804 axes, 8103 passed MPT. Meanwhile, we performed causal inference test (CIT) and identified 2070 circQTL-trans-eGene-ASD diagnosis propagation paths. We showed that the CIT-passing genes were significantly enriched for ASD risk genes, genes encoding postsynaptic density proteins, and other ASD-relevant genes, supporting the relevance of the CIT-passing genes to ASD pathophysiology. Integration of MPT- and CIT-passing axes further constructed 352 circQTL-circRNA-trans-eGene-ASD diagnosis propagation paths, wherein the circRNA-trans-eGene axes may act as causal mediators for the circQTL-ASD diagnosis associations. These analyses were also successfully applied to an independent dataset from schizophrenia brains. 
Collectively, this study provided the first framework for systematically investigating trans-genetic effects of circQTLs and inferring the corresponding causal relations in diseases. The identified circQTL-circRNA-trans-eGene regulatory interactions, particularly the internal modules that were previously implicated in the examined disorders, also provided a helpful dataset for further investigating causative biology and cryptic regulatory mechanisms underlying the neuropsychiatric diseases.
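
The partial correlation test mentioned above can be illustrated with the standard first-order partial correlation: the correlation between circRNA and trans-eGene expression after removing the linear effect of genotype. This is only a sketch under that assumption; the study's actual MPT procedure is not described in the abstract, and all values below are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_corr(x, y, z):
    """Correlation of x and y after adjusting both for z (first-order)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data: genotype dosage, circRNA expression, trans-eGene expression
genotype = [0, 0, 1, 1, 2, 2]
circ_expr = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
gene_expr = [0.9, 1.1, 2.0, 2.2, 3.1, 2.9]

print(pearson(circ_expr, gene_expr))                 # raw correlation
print(partial_corr(circ_expr, gene_expr, genotype))  # adjusted for genotype
```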


Subject(s)
Autism Spectrum Disorder , MicroRNAs , Humans , RNA, Circular/genetics , Autism Spectrum Disorder/genetics , Quantitative Trait Loci/genetics , RNA, Messenger/genetics , Sequence Analysis, RNA , MicroRNAs/genetics , Gene Expression Profiling , Gene Regulatory Networks , RNA/genetics
8.
BMC Bioinformatics ; 23(1): 164, 2022 May 06.
Article in English | MEDLINE | ID: mdl-35524165

ABSTRACT

BACKGROUND: Circular RNAs (circRNAs) are a class of non-coding RNAs formed by pre-mRNA back-splicing, which are widely expressed in animal/plant cells and often play an important role in regulating microRNA (miRNA) activities. While numerous databases have collected a large amount of predicted circRNA candidates and provided the corresponding circRNA-regulated interactions, a stand-alone package for constructing circRNA-miRNA-mRNA interactions based on user-identified circRNAs across species is lacking. RESULTS: We present CircMiMi (circRNA-miRNA-mRNA interactions), a modular, Python-based software to identify circRNA-miRNA-mRNA interactions across 18 species (including 16 animals and 2 plants) with the given coordinates of circRNA junctions. The CircMiMi-constructed circRNA-miRNA-mRNA interactions are derived from circRNA-miRNA and miRNA-mRNA axes with the support of computational predictions and/or experimental data. CircMiMi also allows users to examine alignment ambiguity of back-splice junctions for checking circRNA reliability and examine reverse complementary sequences residing in the sequences flanking the circularized exons for investigating circRNA formation. We further employ CircMiMi to identify circRNA-miRNA-mRNA interactions based on the circRNAs collected in NeuroCirc, a large-scale database of circRNAs in the human brain. We construct circRNA-miRNA-mRNA interactions comprising differentially expressed circRNAs and miRNAs in autism spectrum disorder (ASD) and analyze the relevance of the targets to ASD across species. We thus provide a rich set of ASD-associated circRNA-miRNA-mRNA axes and a useful starting point for investigation of regulatory mechanisms in ASD pathophysiology. CONCLUSIONS: CircMiMi allows users to identify circRNA-mediated interactions in multiple species, shedding light on regulatory roles of circRNAs.
The software package and web interface are freely available at https://github.com/TreesLab/CircMiMi and http://circmimi.genomics.sinica.edu.tw/ , respectively.
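
The idea of deriving circRNA-miRNA-mRNA interactions from circRNA-miRNA and miRNA-mRNA axes amounts to a join on the shared miRNA. The sketch below illustrates that idea only; it is not CircMiMi's interface, and the function name and all identifiers are hypothetical.

```python
from collections import defaultdict

def build_axes(circ_mirna_pairs, mirna_mrna_pairs):
    """Join (circRNA, miRNA) and (miRNA, mRNA) pairs on the shared miRNA."""
    targets = defaultdict(set)
    for mirna, mrna in mirna_mrna_pairs:
        targets[mirna].add(mrna)
    axes = set()
    for circ, mirna in circ_mirna_pairs:
        for mrna in targets.get(mirna, ()):
            axes.add((circ, mirna, mrna))
    return sorted(axes)

# Hypothetical placeholder identifiers
circ_mirna = [("circA", "miR-x"), ("circA", "miR-y"), ("circB", "miR-z")]
mirna_mrna = [("miR-x", "GENE1"), ("miR-x", "GENE2"), ("miR-y", "GENE1")]

for axis in build_axes(circ_mirna, mirna_mrna):
    print(axis)
```

circB yields no axis here because its only miRNA partner has no listed mRNA target.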


Subject(s)
Autism Spectrum Disorder , MicroRNAs , Animals , Gene Regulatory Networks , MicroRNAs/genetics , RNA, Circular , RNA, Messenger/genetics , Reproducibility of Results , Software
9.
Cells ; 10(11), 2021 11 10.
Article in English | MEDLINE | ID: mdl-34831338

ABSTRACT

The developmental potential within pluripotent cells in the canonical model is restricted to embryonic tissues, whereas totipotent cells can differentiate into both embryonic and extraembryonic tissues. Currently, the ability to culture in vitro totipotent cells possessing molecular and functional features like those of an early embryo in vivo has been a challenge. Recently, it was reported that treatment with a single spliceosome inhibitor, pladienolide B (plaB), can successfully reprogram mouse pluripotent stem cells into totipotent blastomere-like cells (TBLCs) in vitro. The TBLCs exhibited totipotency transcriptionally and acquired expanded developmental potential with the ability to yield various embryonic and extraembryonic tissues that may be employed as novel mouse developmental cell models. However, it is disputed whether TBLCs are 'true' totipotent stem cells equivalent to in vivo two-cell stage embryos. To address this question, single-cell RNA sequencing was applied to TBLCs and cells from early mouse embryonic developmental stages and the data were integrated using canonical correlation analyses. Differential expression analyses were performed between TBLCs and multi-embryonic cell stages to identify differentially expressed genes. Remarkably, a subpopulation within the TBLCs population expressed a high level of the totipotent-related genes Zscan4s and displayed transcriptomic features similar to mouse two-cell stage embryonic cells. This study underscores the subtle differences between in vitro derived TBLCs and in vivo mouse early developmental cell stages at the single-cell transcriptomic level. Our study has identified a new experimental model for stem cell biology, namely 'cluster 3', as a subpopulation of TBLCs that can be molecularly defined as near totipotent cells.


Subject(s)
Blastomeres/cytology , Embryo, Mammalian/cytology , Mouse Embryonic Stem Cells/cytology , Single-Cell Analysis , Totipotent Stem Cells/cytology , Transcriptome/genetics , Animals , Cluster Analysis , Gene Expression Regulation , Gene Ontology , Mice , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/metabolism , Signal Transduction , Zygote/metabolism
10.
Genome Res ; 30(3): 375-391, 2020 03.
Article in English | MEDLINE | ID: mdl-32127416

ABSTRACT

Circular RNAs (circRNAs), a class of long noncoding RNAs, are known to be enriched in mammalian neural tissues. Although a wide range of dysregulation of gene expression in autism spectrum disorder (ASD) has been reported, the role of circRNAs in ASD remains largely unknown. Here, we performed genome-wide circRNA expression profiling in postmortem brains from individuals with ASD and controls and identified 60 circRNAs and three coregulated modules that were perturbed in ASD. By integrating circRNA, microRNA, and mRNA dysregulation data derived from the same cortex samples, we identified 8170 ASD-associated circRNA-microRNA-mRNA interactions. Putative targets of the axes were enriched for ASD risk genes and genes encoding inhibitory postsynaptic density (PSD) proteins, but not for genes implicated in monogenetic forms of other brain disorders or genes encoding excitatory PSD proteins. This reflects the previous observation that ASD-derived organoids show overproduction of inhibitory neurons. We further confirmed that some ASD risk genes (NLGN1, STAG1, HSD11B1, VIP, and UBA6) were regulated by an up-regulated circRNA (circARID1A) via sponging a down-regulated microRNA (miR-204-3p) in human neuronal cells. Particularly, alteration of NLGN1 expression is known to affect the dynamic processes of memory consolidation and strengthening. To the best of our knowledge, this is the first systems-level view of circRNA regulatory networks in ASD cortex samples. We provided a rich set of ASD-associated circRNA candidates and the corresponding circRNA-microRNA-mRNA axes, particularly those involving ASD risk genes. Our findings thus support a role for circRNA dysregulation and the corresponding circRNA-microRNA-mRNA axes in ASD pathophysiology.


Subject(s)
Autism Spectrum Disorder/genetics , Gene Expression Regulation , MicroRNAs/metabolism , RNA, Circular/metabolism , RNA, Messenger/metabolism , Astrocytes/metabolism , Autism Spectrum Disorder/metabolism , Brain/metabolism , Cell Line , Genome, Human , Humans , Neural Stem Cells/metabolism , Neurons/metabolism
11.
Genome Res ; 29(11): 1766-1776, 2019 11.
Article in English | MEDLINE | ID: mdl-31515285

ABSTRACT

Adenosine-to-inosine (A-to-I) RNA editing is a very common co-/posttranscriptional modification that can lead to A-to-G changes at the RNA level and compensate for G-to-A genomic changes to a certain extent. It has been shown that each healthy individual can carry dozens of missense variants predicted to be severely deleterious. Why strongly detrimental variants are preserved in a population and not eliminated by negative natural selection remains mostly unclear. Here, we ask if RNA editing correlates with the burden of deleterious A/G polymorphisms in a population. Integrating genome and transcriptome sequencing data from 447 human lymphoblastoid cell lines, we show that nonsynonymous editing activities (prevalence/level) are negatively correlated with the deleteriousness of A-to-G genomic changes and positively correlated with that of G-to-A genomic changes within the population. We find a significantly negative correlation between nonsynonymous editing activities and allele frequency of A within the population. This negative editing-allele frequency correlation is particularly strong when editing sites are located in highly important genes/loci. Examinations of deleterious missense variants from the 1000 Genomes Project further show a significantly higher proportion of rare missense mutations for G-to-A changes than for other types of changes. The proportion for G-to-A changes increases with increasing deleterious effects of the changes. Moreover, the deleteriousness of G-to-A changes is significantly positively correlated with the percentage of editing enzyme binding motifs at the variants. Overall, we show that nonsynonymous editing is associated with the increased burden of G-to-A missense mutations in healthy individuals, expanding RNA editing in pathogenomics studies.


Subject(s)
Adenosine/genetics , Inosine/genetics , Mutation, Missense , RNA Editing , RNA/genetics , Gene Frequency , Humans
12.
13.
Acta Neuropathol Commun ; 7(1): 50, 2019 03 29.
Article in English | MEDLINE | ID: mdl-30922385

ABSTRACT

TAR DNA-binding protein (TDP-43) is a ubiquitously expressed nuclear protein, which participates in a number of cellular processes and has been identified as the major pathological factor in amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD). Here we constructed a conditional TDP-43 mouse with depletion of TDP-43 in the mouse forebrain and found that the mice exhibit a whole spectrum of age-dependent frontotemporal dementia-like behaviour abnormalities including perturbation of social behaviour, development of dementia-like behaviour, changes of activities of daily living, and memory loss at a later stage of life. These variations are accompanied by inflammation, neurodegeneration, and abnormal synaptic plasticity of the mouse CA1 neurons. Importantly, analysis of the cortical RNA transcripts of the conditional knockout mice at the pre-/post-symptomatic stages and the corresponding wild type mice reveals age-dependent alterations in the expression levels and RNA processing patterns of a set of genes closely associated with inflammation, social behaviour, synaptic plasticity, and neuron survival. This study not only supports the scenario that loss-of-function of TDP-43 in mice may recapitulate key behaviour features of the FTLD diseases, but also provides a list of TDP-43 target genes/transcript isoforms useful for future therapeutic research.


Subject(s)
DNA-Binding Proteins/deficiency , Frontotemporal Dementia/metabolism , Neurons/metabolism , Prosencephalon/metabolism , Transcriptome/physiology , Age Factors , Animals , DNA-Binding Proteins/genetics , Frontotemporal Dementia/genetics , Frontotemporal Dementia/pathology , Gene Expression Profiling/methods , Mice , Mice, Knockout , Mice, Transgenic , Neurons/pathology , Prosencephalon/pathology
14.
BMC Bioinformatics ; 20(1): 3, 2019 Jan 03.
Article in English | MEDLINE | ID: mdl-30606103

ABSTRACT

BACKGROUND: Non-co-linear (NCL) transcripts consist of exonic sequences that are topologically inconsistent with the reference genome in an intragenic fashion (circular or intragenic trans-spliced RNAs) or in an intergenic fashion (fusion or intergenic trans-spliced RNAs). On the basis of RNA-seq data, numerous NCL event detectors have been developed and detected thousands of NCL events in diverse species. However, there are great discrepancies in the identification results among detectors, indicating a considerable proportion of false positives in the detected NCL events. Although several helpful guidelines for evaluating the performance of NCL event detectors have been provided, a systematic guideline for measurement of NCL events identified by existing tools has not been available. RESULTS: We develop a software tool, NCLcomparator, for systematically post-screening the intragenic or intergenic NCL events identified by various NCL detectors. NCLcomparator first examines whether the input NCL events are potentially false positives derived from ambiguous alignments (i.e., the NCL events have an alternative co-linear explanation or multiple matches against the reference genome). To evaluate the reliability of the identified NCL events, we define the NCL score (NCLscore) based on the variation in the number of supporting NCL junction reads identified by the tools examined. Of the input NCL events, we show that the ambiguous alignment-derived events have relatively lower NCLscore values than the other events, indicating that an NCL event with a higher NCLscore has a higher level of reliability. To help select highly expressed NCL events, NCLcomparator also provides a series of useful measurements such as the expression levels of the detected NCL events and their corresponding host genes and the junction usage of the co-linear splice junctions at both NCL donor and acceptor sites.
CONCLUSION: NCLcomparator provides useful guidelines, with the input of identified NCL events from various detectors and the corresponding paired-end RNA-seq data only, to help users select potentially high-confidence NCL events for further functional investigation. The software thus helps to facilitate future studies into NCL events, shedding light on the fundamental biology of this important but understudied class of transcripts. NCLcomparator is freely accessible at https://github.com/TreesLab/NCLcomparator .


Subject(s)
Gene Fusion/genetics , Genome/genetics , RNA Splicing/genetics , RNA/genetics , Sequence Analysis, RNA/methods
15.
Nucleic Acids Res ; 46(7): 3671-3691, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29385530

ABSTRACT

Transcriptionally non-co-linear (NCL) transcripts can originate from trans-splicing (trans-spliced RNA; 'tsRNA') or cis-backsplicing (circular RNA; 'circRNA'). While numerous circRNAs have been detected in various species, tsRNAs remain largely uninvestigated. Here, we utilize integrative transcriptome sequencing of poly(A)- and non-poly(A)-selected RNA-seq data from diverse human cell lines to distinguish between tsRNAs and circRNAs. We identified 24,498 NCL events and found that a considerable proportion (20-35%) of them arise from both tsRNAs and circRNAs, representing extensive alternative trans-splicing and cis-backsplicing in human cells. We show that sequence generalities of exon circularization are also observed in tsRNAs. Recapitulation of NCL RNAs further shows that inverted Alu repeats can simultaneously promote the formation of tsRNAs and circRNAs. However, tsRNAs and circRNAs exhibit quite different, or even opposite, expression patterns, in terms of correlation with the expression of their co-linear counterparts, expression breadth/abundance, transcript stability, and subcellular localization preference. These results indicate that tsRNAs and circRNAs may play different regulatory roles and analysis of NCL events should take the joint effects of different NCL-splicing types and joint effects of multiple NCL events into consideration. This study describes the first transcriptome-wide analysis of trans-splicing and cis-backsplicing, expanding our understanding of the complexity of the human transcriptome.


Subject(s)
Alternative Splicing/genetics , RNA/genetics , Trans-Splicing/genetics , Transcriptome/genetics , Exons/genetics , Gene Expression Profiling , Humans , RNA Splicing/genetics , RNA, Circular
16.
Genome Biol Evol ; 10(2): 521-537, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29294013

ABSTRACT

Adenosine-to-inosine (A-to-I) editing is widespread across the kingdom Metazoa. However, owing to the lack of comprehensive analysis in nonmodel animals, the evolutionary history of A-to-I editing remains largely unexplored. Here, we detect high-confidence editing sites using clustering and conservation strategies based on RNA sequencing data alone, without using single-nucleotide polymorphism information or genome sequencing data from the same sample. We thereby unveil the first evolutionary landscape of A-to-I editing maps across 20 metazoan species (from worm to human), providing unprecedented evidence on how the editing mechanism gradually expands its territory and increases its influence along the history of evolution. Our results revealed that highly clustered and conserved editing sites tended to have a higher editing level and a higher magnitude of the ADAR motif. The ratio of the frequencies of nonsynonymous editing to that of synonymous editing remarkably increased with increasing conservation level of A-to-I editing. These results thus suggest a potential functional benefit of highly clustered and conserved editing sites. In addition, spatiotemporal dynamics analyses reveal a conserved enrichment of editing and ADAR expression in the central nervous system throughout more than 300 Myr of divergent evolution in complex animals and the comparability of editing patterns between invertebrates and between vertebrates during development. This study provides evolutionary and dynamic aspects of A-to-I editome across metazoan species, expanding this important but understudied class of nongenomically encoded events for comprehensive characterization.


Subject(s)
Adenosine/genetics , Inosine/genetics , RNA Editing , RNA/genetics , Animals , Cluster Analysis , Evolution, Molecular , Humans , Sequence Analysis, RNA
17.
Sci Rep ; 7(1): 7038, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28765567

ABSTRACT

Genomic imprinting is an important epigenetic process that silences one of the parentally-inherited alleles of a gene and thereby exhibits allelic-specific expression (ASE). Detection of human imprinting events is hampered by the infeasibility of the reciprocal mating system in humans and the removal of ASE events arising from non-imprinting factors. Here, we describe a pipeline with the pattern of reciprocal allele descendants (RADs) through genotyping and transcriptome sequencing data across independent parent-offspring trios to discriminate between varied types of ASE (e.g., imprinting, genetic variation-dependent ASE, and random monoallelic expression (RME)). We show that the vast majority of ASE events are due to sequence-dependent genetic variants, which are evolutionarily conserved and may themselves play a cis-regulatory role. Particularly, 74% of non-RAD ASE events, even though they exhibit ASE biases toward the same parentally-inherited allele across different individuals, are derived from genetic variation but not imprinting. We further show that the RME effect may affect the effectiveness of the population-based method for detecting imprinting events and our pipeline can help to distinguish between these two ASE types. Taken together, this study provides a good indicator for categorization of different types of ASE, opening up this widespread and complex mechanism for comprehensive characterization.


Subject(s)
Alleles , Family Health , Gene Expression Profiling/methods , Genetic Variation , Genomic Imprinting , Genotype , Genotyping Techniques/methods , Humans
18.
Sci Rep ; 6: 27272, 2016 06 03.
Article in English | MEDLINE | ID: mdl-27255481

ABSTRACT

Genome-wide analyses have observed an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions, and suggested that this is due to cryptic variation in the mutation rate. While this phenomenon primarily corresponds with non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNPO/E) to estimate the prevalence of human-chimpanzee coSNPs, and show that the excess of coSNPs is also present in coding regions. Intriguingly, coSNPO/E is much higher at zero-fold than at nonzero-fold degenerate sites; this difference is due to an elevation of coSNPO/E at zero-fold degenerate sites, rather than a reduction at nonzero-fold degenerate ones. These trends are independent of chimpanzee subpopulation, population size, and sequencing technique, and hold broadly across primates. We find that this discrepancy cannot be fully explained by sequence context, shared ancestral polymorphisms, SNP density, or recombination rate, and that coSNPO/E in coding sequences is significantly influenced by purifying selection. We also show that selection and mutation rate affect coSNPO/E independently, and that coSNPs tend to be less damaging and more correlated with human diseases than non-coSNPs. These results suggest that coSNPs may represent a "signature" during primate protein evolution.
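
The observed-to-expected ratio mentioned in this abstract can be illustrated with a minimal sketch. It assumes the simplest null model, in which SNPs in the two species fall independently on the same set of orthologous sites, so that the expected number of coincident sites is the product of the two SNP counts divided by the number of sites. The abstract does not state how the expectation was computed, and the function name `cosnp_obs_exp` and its arguments are hypothetical.

```python
def cosnp_obs_exp(human_snps, chimp_snps, n_sites):
    """Observed-to-expected ratio of coincident SNPs (illustrative only).

    human_snps, chimp_snps: iterables of orthologous site indices that are
    polymorphic in each species. n_sites: number of orthologous sites surveyed.
    ASSUMPTION: the expectation uses a plain independence model; the study's
    own calculation may correct for further factors.
    """
    human = set(human_snps)
    chimp = set(chimp_snps)
    observed = len(human & chimp)  # sites polymorphic in both species
    expected = len(human) * len(chimp) / n_sites  # independence expectation
    if expected == 0:
        raise ValueError("expected coSNP count is zero; ratio is undefined")
    return observed / expected


# Toy data: 4 human SNPs, 5 chimpanzee SNPs, 2 shared sites among 100 sites.
ratio = cosnp_obs_exp({1, 2, 3, 4}, {3, 4, 5, 6, 7}, 100)
```

Under this model a ratio above 1 indicates more coincident sites than chance alone would produce. Comparing the ratio between site classes, such as zero-fold and nonzero-fold degenerate sites, would amount to calling the function separately on each class.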


Subject(s)
Pan troglodytes/genetics , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Animals , Databases, Genetic , Evolution, Molecular , Humans , Mutation Rate , Selection, Genetic
19.
Nucleic Acids Res ; 44(3): e29, 2016 Feb 18.
Article in English | MEDLINE | ID: mdl-26442529

ABSTRACT

Analysis of RNA-seq data often detects numerous 'non-co-linear' (NCL) transcripts, which comprise sequence segments that are topologically inconsistent with their corresponding DNA sequences in the reference genome. However, detection of NCL transcripts involves two major challenges: removal of false positives arising from alignment artifacts and discrimination between different types of NCL transcripts (trans-spliced, circular or fusion transcripts). Here, we developed a new NCL-transcript-detecting method ('NCLscan'), which utilizes a stepwise alignment strategy to almost completely eliminate false calls (>98% precision) without sacrificing true positives, enabling NCLscan to outperform 18 other publicly available tools (including fusion- and circular-RNA-detecting tools) in terms of sensitivity and precision, regardless of the generation strategy of the simulated dataset, type of intragenic or intergenic NCL event, read depth of coverage, read length or expression level of the NCL transcript. Given this high accuracy, NCLscan was applied to distinguish between trans-spliced, circular and fusion transcripts on the basis of poly(A)- and nonpoly(A)-selected RNA-seq data. We showed that circular RNAs were expressed more ubiquitously, more abundantly and less cell type-specifically than trans-spliced and fusion transcripts. Our study thus describes a robust pipeline for the discovery of NCL transcripts, and sheds light on the fundamental biology of these non-canonical RNA events in the human transcriptome.


Subject(s)
RNA Splicing , RNA, Messenger/genetics , RNA/genetics , Limit of Detection , RNA, Circular , Reproducibility of Results
20.
Wiley Interdiscip Rev RNA ; 6(5): 563-79, 2015.
Article in English | MEDLINE | ID: mdl-26230526

ABSTRACT

Circular RNAs (circRNAs) arise during post-transcriptional processes, in which a single-stranded RNA molecule forms a circle through covalent binding. Previously, circRNA products were often regarded as splicing intermediates, by-products, or products of aberrant splicing. Recently, however, rapid advances in high-throughput RNA sequencing (RNA-seq) for global investigation of non-co-linear (NCL) RNAs, which comprise sequence segments that are topologically inconsistent with the reference genome, have led to renewed interest in this type of NCL RNA (i.e., circRNA), especially exonic circRNAs (ecircRNAs). Although the biogenesis and function of ecircRNAs are mostly unknown, some ecircRNAs are abundant, highly expressed, or evolutionarily conserved. Some ecircRNAs have been shown to affect microRNA regulation, and probably play roles in regulating parental gene transcription, cell proliferation, and RNA-binding proteins, indicating their functional potential for development as diagnostic tools. To date, thousands of ecircRNAs have been identified in multiple tissues/cell types from diverse species, through analyses of RNA-seq data. However, the detection of ecircRNA candidates involves several major challenges, including discrimination between ecircRNAs and other types of NCL RNAs (e.g., trans-spliced RNAs and genetic rearrangements); removal of sequencing errors, alignment errors, and in vitro artifacts; and the reconciliation of heterogeneous results arising from the use of different bioinformatics methods or sequencing data generated under different treatments. Such challenges may severely hamper the understanding of ecircRNAs. Herein, we review the biogenesis, identification, properties, and function of ecircRNAs, and discuss some unanswered questions regarding ecircRNAs. We also evaluate the accuracy (in terms of sensitivity and precision) of some well-known circRNA-detecting methods.
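
As a hedged illustration of what an accuracy evaluation in terms of sensitivity and precision involves, the sketch below scores one detector's calls against a set of known events, for example junctions planted in a simulated dataset. The function name `sensitivity_precision` and the junction identifiers are hypothetical and are not taken from the review or from any particular tool.

```python
def sensitivity_precision(true_events, called_events):
    """Return (sensitivity, precision) for a set of detector calls.

    true_events: identifiers of events known to be present (ground truth).
    called_events: identifiers reported by the detector under evaluation.
    Both sets must be non-empty.
    """
    truth = set(true_events)
    calls = set(called_events)
    if not truth or not calls:
        raise ValueError("both event sets must be non-empty")
    true_pos = len(truth & calls)  # real events that were reported
    sensitivity = true_pos / len(truth)  # TP / (TP + FN)
    precision = true_pos / len(calls)  # TP / (TP + FP)
    return sensitivity, precision


# Hypothetical junction identifiers: 4 real events, 3 calls, 2 of them correct.
truth = {"chr1:100|chr1:50", "chr2:900|chr2:300", "chr3:40|chr3:10", "chr4:7|chr4:2"}
calls = {"chr1:100|chr1:50", "chr2:900|chr2:300", "chr9:5|chr9:1"}
sens, prec = sensitivity_precision(truth, calls)
```

Here sensitivity is the fraction of real events recovered and precision is the fraction of calls that are real, so a method can score well on one measure while scoring poorly on the other.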


Subject(s)
Exons , Nucleic Acid Conformation , RNA Processing, Post-Transcriptional/physiology , RNA , Animals , Humans , RNA/genetics , RNA/metabolism