Results 1 - 20 of 38

1.
Comput Struct Biotechnol J ; 23: 3175-3185, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39253057

ABSTRACT

5-formylcytidine (f5C) is a unique post-transcriptional RNA modification found in mRNA and tRNA at the wobble site, playing a crucial role in mitochondrial protein synthesis and potentially contributing to the regulation of translation. Recent studies have unveiled that the f5C modifications may drive mitochondrial mRNA translation to power cancer metastasis. Accurate identification of f5C sites is essential for further unraveling their molecular functions and regulatory mechanisms, but there are currently no computational methods available for predicting their locations. In this study, we introduce an innovative ensemble approach, successfully enabling the computational recognition of Saccharomyces cerevisiae f5C. We conducted a comprehensive model selection process that involved multiple basic machine learning and deep learning algorithms such as recurrent neural networks, convolutional neural networks and Transformer-based models. Initially trained only on sequence information, these individual models achieved an AUROC ranging from 0.7104 to 0.7492. Through the integration of 32 novel domain-derived genomic features, the performance of individual models has significantly improved to an AUROC between 0.7309 and 0.8076. To further enhance accuracy and robustness, we then constructed the ensembles of these individual models with different combinations. The best performance attained by our ensemble models reached an AUROC of 0.8391. Shapley additive explanations were conducted to explain the significant contributions of genomic features, providing insights into the putative distribution of f5C across various topological regions and potentially paving the way for revealing their functional relevance within distinct genomic contexts. A freely accessible web server that allows real-time analysis of user-uploaded sites can be accessed at: www.rnamd.org/Resf5C-Pred.
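The ensemble step described in this abstract can be illustrated with a minimal sketch. It assumes unweighted averaging of per-site scores (soft voting) and computes AUROC from pairwise rank comparisons; the abstract does not state how the authors combined their models, so the function names, the averaging rule, and the toy scores below are illustrative assumptions only.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative site")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_ensemble(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Unweighted mean of the scores each model assigns to the same site."""
    n_models = len(model_scores)
    return [sum(site) / n_models for site in zip(*model_scores)]


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = modified site, 0 = unmodified site.
    labels = [1, 1, 1, 0, 0, 0]
    model_a = [0.9, 0.4, 0.6, 0.5, 0.2, 0.3]
    model_b = [0.5, 0.8, 0.3, 0.2, 0.6, 0.1]
    combined = average_ensemble([model_a, model_b])
    for name, scores in [("A", model_a), ("B", model_b), ("A+B", combined)]:
        print(name, round(auroc(labels, scores), 4))
```

In this toy case the two models make different mistakes, so the averaged scores rank the sites better than either model alone, which is the usual motivation for ensembling.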

2.
Mol Carcinog ; 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39092767

ABSTRACT

Vascular endothelial growth factor A (VEGFA) plays a critical role as a potent angiogenesis factor and is highly expressed in hepatocellular carcinoma (HCC). Although the expression of VEGFA has been strongly linked to the aggressive nature of HCC, the specific posttranscriptional modifications that might contribute to VEGFA expression and HCC angiogenesis are not yet well understood. In this study, we aimed to investigate the epitranscriptome regulation of VEGFA in HCC. A comprehensive analysis integrating MeRIP-seq, RNA-seq, and crosslinking-immunoprecipitation-seq data revealed that VEGFA was hypermethylated in HCC and identified the potential m6A regulators of VEGFA including an m6A methyltransferase complex component RBM15 and the two readers, YTHDF2 and IGF2BP3. Through rigorous cell and molecular biology experiments, RBM15 was validated as a key component of methyltransferase complex responsible for m6A methylation of VEGFA, which was subsequently recognized and stabilized by IGF2BP3 and YTHDF2, leading to enhanced VEGFA expression and VEGFA-related functions such as human umbilical vascular endothelial cells (HUVEC) migration and tube formation. In the HCC xenograft model, knockdown of RBM15, IGF2BP3, or YTHDF2 resulted in reduced expression of VEGFA, accompanied by significant inhibition of tumor growth closely associated with VEGFA expression and angiogenesis. Furthermore, our analysis of HCC clinical samples identified positive correlations between the expression levels of VEGFA and the regulators RBM15, IGF2BP3, and YTHDF2. Collectively, these findings offer novel insights into the posttranscriptional modulation of VEGFA and provide potential avenues for alternative approaches to antiangiogenesis therapy targeting VEGFA.

3.
Front Pharmacol ; 15: 1393717, 2024.
Article in English | MEDLINE | ID: mdl-38939838

ABSTRACT

Background: Mesaconitine (MA), a diester-diterpenoid alkaloid extracted from the medicinal herb Aconitum carmichaelii, is commonly used to treat various diseases. Previous studies have indicated the potent toxicity of aconitum despite its pharmacological activities, with limited understanding of its effects on the nervous system and the underlying mechanisms. Methods: HT22 cells and zebrafish were used to investigate the neurotoxic effects of MA both in vitro and in vivo, employing multi-omics techniques to explore the potential mechanisms of toxicity. Results: Our results demonstrated that treatment with MA induces neurotoxicity in zebrafish and HT22 cells. Subsequent analysis revealed that MA induced oxidative stress, as well as structural and functional damage to mitochondria in HT22 cells, accompanied by an upregulation of mRNA and protein expression related to autophagic and lysosomal pathways. Furthermore, methylated RNA immunoprecipitation sequencing (MeRIP-seq) showed a correlation between the expression of autophagy-related genes and N6-methyladenosine (m6A) modification following MA treatment. In addition, we identified METTL14 as a potential regulator of m6A methylation in HT22 cells after exposure to MA. Conclusion: Our study has contributed to a thorough mechanistic elucidation of the neurotoxic effects caused by MA, and has provided valuable insights for optimizing the rational utilization of traditional Chinese medicine formulations containing aconitum in clinical practice.

4.
Chem Biol Interact ; 395: 111036, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38705443

ABSTRACT

Gelsemium elegans Benth. (G. elegans) is a traditional medicinal herb that has anti-inflammatory, analgesic, sedative, and detumescence effects. However, it can also cause intestinal side effects such as abdominal pain and diarrhea. The toxicological mechanisms of gelsenicine are still unclear. The objective of this study was to assess enterotoxicity induced by gelsenicine in the nematode Caenorhabditis elegans (C. elegans). The nematodes were treated with gelsenicine, and subsequently their growth, development, and locomotion behavior were evaluated. The targets of gelsenicine were predicted using PharmMapper. mRNA-seq was performed to verify the predicted targets. Intestinal permeability, ROS generation, and lipofuscin accumulation were measured. Additionally, the fluorescence intensities of GFP-labeled proteins involved in oxidative stress and unfolded protein response in endoplasmic reticulum (UPRER) were quantified. As a result, gelsenicine treatment shortened nematode lifespan and reduced body length, width, and locomotion behavior. A total of 221 targets were predicted by PharmMapper, and 731 differentially expressed genes were screened out by mRNA-seq. GO and KEGG enrichment analysis revealed involvement in redox process and transmembrane transport. The permeability assay showed leakage of blue dye from the intestinal lumen into the body cavity. Abnormal mRNA expression of gem-4, hmp-1, fil-2, and pho-1, which regulate intestinal development, absorption and catabolism, transmembrane transport, and apical junctions, was observed. Intestinal lipofuscin and ROS were increased, while sod-2 and isp-1 expression was decreased. Multiple proteins in the SKN-1/DAF-16 pathway were found to bind stably with gelsenicine in a predictive model. There was an up-regulation in the expression of SKN-1:GFP, while the nuclear translocation of DAF-16:GFP exhibited abnormality. The UPRER biomarker HSP-4:GFP was down-regulated.
In conclusion, gelsenicine treatment increased nematode intestinal permeability. The toxicological mechanisms underlying this effect involved the disruption of intestinal barrier integrity, an imbalance between oxidative and antioxidant processes mediated by the SKN-1/DAF-16 pathway, and an abnormal unfolded protein response.


Subject(s)
Caenorhabditis elegans , Reactive Oxygen Species , Animals , Caenorhabditis elegans/drug effects , Caenorhabditis elegans/metabolism , Reactive Oxygen Species/metabolism , Quinoxalines/pharmacology , Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans Proteins/genetics , Oxidative Stress/drug effects , Intestines/drug effects , Intestinal Mucosa/metabolism , Intestinal Mucosa/drug effects , Gelsemium/chemistry , Unfolded Protein Response/drug effects , Permeability/drug effects , Lipofuscin/metabolism , Locomotion/drug effects , Indole Alkaloids
5.
Genes (Basel) ; 15(3)2024 03 09.
Article in English | MEDLINE | ID: mdl-38540406

ABSTRACT

Lipid metabolism participates in various physiological processes and has been shown to be connected to the development and progression of multiple diseases, especially metabolic hepatopathy. Apolipoproteins (Apos) act as vectors that combine with lipids, such as cholesterol and triglycerides (TGs). Despite being involved in lipid transportation and metabolism, the critical role of Apos in the maintenance of lipid metabolism has still not been fully revealed. This study sought to clarify variations related to m6A methylome in ApoF gene knockout mice with disordered lipid metabolism based on the bioinformatics method of transcriptome-wide m6A methylome epitranscriptomics. High-throughput methylated RNA immunoprecipitation sequencing (MeRIP-seq) was conducted in both wild-type (WT) and ApoF knockout (KO) mice. As a result, the liver histopathology presented vacuolization and steatosis, and the serum biochemical assays reported abnormal lipid content in KO mice. The m6A-modified mRNAs conformed to the consensus sequence found in eukaryotes, and the distribution was enriched within the coding sequences and 3' non-coding regions. In KO mice, the functional annotation terms of the differentially expressed genes (DEGs) included cholesterol, steroid and lipid metabolism, and lipid storage. In the differentially m6A-methylated mRNAs, the functional annotation terms included cholesterol, TG, and long-chain fatty acid metabolic processes; lipid transport; and liver development. The overlapping DEGs and differential m6A-modified mRNAs were also enriched in terms of lipid metabolism disorder. In conclusion, transcriptome-wide MeRIP sequencing in ApoF KO mice demonstrated the role of this crucial apolipoprotein in liver health and lipid metabolism.


Subject(s)
Adenine , Lipid Metabolism , Transcriptome , Animals , Mice , Adenine/analogs & derivatives , Cholesterol/genetics , Cholesterol/metabolism , Epigenome , Lipid Metabolism/genetics , Liver/metabolism , RNA, Messenger/metabolism , Transcriptome/genetics , Triglycerides/genetics , Triglycerides/metabolism
6.
J Gastrointest Oncol ; 15(1): 203-219, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38482248

ABSTRACT

Background: Mucinous colonic adenocarcinoma remains a challenging disease due to its high propensity for metastasis and recurrence. N7-methylguanosine (m7G) and long non-coding RNA (lncRNA) are closely associated with the occurrence and progression of tumors. However, research on m7G-related lncRNA in mucinous colonic adenocarcinoma is lacking. Therefore, we sought to explore the prognostic impact of m7G-related lncRNAs in mucinous adenocarcinoma (MC) patients. Methods: In this study, Pearson analysis was used to identify m7G-related lncRNAs from transcriptome data in The Cancer Genome Atlas (TCGA). Univariate Cox regression analysis and least absolute shrinkage and selection operator (LASSO) regression were used to further screen m7G-related lncRNAs and incorporate them into a prognostic signature. Based on the risk model, patients were divided into low- and high-risk groups and randomly assigned to the training set and test sets in a 6:4 ratio. Kaplan-Meier, receiver operating characteristic (ROC) curve, multivariate regression, and nomogram analyses were used to confirm the accuracy of the signature. The CIBERSORT algorithm was used to calculate the degree of immune cell infiltration (ICI). Finally, the correlation of the prognostic signature with tumor mutational burden (TMB) and immunophenotype score (IPS) was evaluated. Results: A total of 432 m7G-related lncRNAs were identified by Pearson analysis. Univariate Cox regression, LASSO regression and survival analysis were performed to further select six m7G-related lncRNAs (P<0.05): AC254629.1, LINC01133, LINC01134, MHENCR, SMIM2-AS1, and XACT. Based on the risk model, heat maps, Kaplan-Meier curves, and ROC curves were constructed, and the results showed that there were significant differences in expression levels and survival status between the two risk groups. The area under the ROC curve (AUC) values for 3-, 5-, and 10-year survival in the training set were 0.944, 0.957, and 1.000, respectively. 
In the test set, the corresponding AUC values were 0.964, 1.000, and 1.000, respectively. Subsequently, univariate and multivariate regression analyses of clinical characteristics and risk score were performed. For the risk score, the univariate and multivariate results were [hazard ratio (HR): 6.458, 95% confidence interval (CI): 2.708-15.403, P<0.001; HR: 7.280, 95% CI: 2.500-21.203, P<0.001], respectively. With the risk score as an independent prognostic factor, its AUC at 3, 5, and 10 years was 0.911, 0.955, and 0.961, respectively. Calibration plots for the nomogram show that the model calibration line is very close to the ideal calibration line, indicating good calibration. The level of ICI was significantly different in the different risk groups. Survival analysis showed that, regardless of TMB risk, patients with MC and a high-risk score consistently had a poor overall survival (OS). Conclusions: The m7G-related lncRNA prognostic signature has potential value for the prognosis of mucinous colonic adenocarcinoma.

7.
Nucleic Acids Res ; 52(D1): D194-D202, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37587690

ABSTRACT

N6-Methyladenosine (m6A) is one of the most abundant internal chemical modifications on eukaryote mRNA and is involved in numerous essential molecular functions and biological processes. To facilitate the study of this important post-transcriptional modification, we present here m6A-Atlas v2.0, an updated version of m6A-Atlas. It was expanded to include a total of 797 091 reliable m6A sites from 13 high-resolution technologies and two single-cell m6A profiles. Additionally, three methods (exomePeaks2, MACS2 and TRESS) were used to identify >16 million m6A enrichment peaks from 2712 MeRIP-seq experiments covering 651 conditions in 42 species. Quality control results of MeRIP-seq samples were also provided to help users to select reliable peaks. We also estimated the condition-specific quantitative m6A profiles (i.e. differential methylation) under 172 experimental conditions for 19 species. Further, to provide insights into potential functional circuitry, the m6A epitranscriptomics were annotated with various genomic features, interactions with RNA-binding proteins and microRNA, potentially linked splicing events and single nucleotide polymorphisms. The collected m6A sites and their functional annotations can be freely queried and downloaded via a user-friendly graphical interface at: http://rnamd.org/m6a.


Subject(s)
Databases, Genetic , RNA Methylation , RNA, Messenger , Transcriptome , RNA Splicing , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA Processing, Post-Transcriptional
8.
Nucleic Acids Res ; 52(D1): D203-D212, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37811871

ABSTRACT

With recent progress in mapping N7-methylguanosine (m7G) RNA methylation sites, tens of thousands of experimentally validated m7G sites have been discovered in various species, shedding light on the significant role of m7G modification in regulating numerous biological processes including disease pathogenesis. An integrated resource that enables the sharing, annotation and customized analysis of m7G data will greatly facilitate m7G studies under various physiological contexts. We previously developed the m7GHub database to host mRNA m7G sites identified in the human transcriptome. Here, we present m7GHub v.2.0, an updated resource for a comprehensive collection of m7G modifications in various types of RNA across multiple species: an m7GDB database containing 430 898 putative m7G sites identified in 23 species, collected from both widely applied next-generation sequencing (NGS) and the emerging Oxford Nanopore direct RNA sequencing (ONT) techniques; an m7GDiseaseDB hosting 156 206 m7G-associated variants (involving addition or removal of an m7G site), including 3238 disease-relevant m7G-SNPs that may function through epitranscriptome disturbance; and two enhanced analysis modules to perform interactive analyses on the collections of m7G sites (m7GFinder) and functional variants (m7GSNPer). We expect that m7GHub v.2.0 should serve as a valuable centralized resource for studying m7G modification. It is freely accessible at: www.rnamd.org/m7GHub2.


Subject(s)
Databases, Nucleic Acid , High-Throughput Nucleotide Sequencing , RNA Processing, Post-Transcriptional , Humans , Data Interpretation, Statistical , Guanosine/genetics
9.
Biomed Pharmacother ; 169: 115919, 2023 Dec 31.
Article in English | MEDLINE | ID: mdl-37992574

ABSTRACT

Euphorbia factor L1 (EFL1) is a lathyrane-type diterpenoid isolated from the medicinal herb Euphorbia lathyris L. (Euphorbiaceae); it has been reported to cause intestinal irritation, but the underlying mechanisms are still obscure. The objective of this study was to assess the EFL1-induced intestinal cytotoxicity in human colon adenocarcinoma Caco-2 cells. The Caco-2 cells were treated with EFL1, and the intracellular calcium ion concentration, mitochondrial membrane potential (MMP), mitochondrial permeability transition pore (mPTP), adenosine 5'-triphosphate (ATP) content, ATPase activities, TGF-β1 concentration, and transepithelial electrical resistance (TEER) were detected. The interaction between EFL1 and the tight junction proteins Occludin, Claudin-4, Tricellulin, ZO-1, JAM-1, and E-cadherin was simulated by molecular docking. The expression of proteins involved in the energy metabolism, the ion transporters and aquaporins, the tight junction, and the F-actin cytoskeleton was detected by Western blotting and cell immunofluorescence. As a result, EFL1 decreased the intracellular Ca2+, MMP, mPTP, ATP content, and ATPase activities in the Caco-2 cells. The AMPK/SIRT1/PGC-1α signaling pathway, which regulates the energy metabolism, was inhibited. The ion transporters NEH and CFTR, as well as the aquaporins in the Caco-2 cells, were decreased. The tight junction proteins were down-regulated, and the integrity of the intestinal barrier was injured; TGF-β1 was compensatively increased; so, the intestinal permeability was increased and was characterized by decreased TEER. The morphology of the F-actin cytoskeleton was destroyed. These findings indicated that EFL1 caused cytotoxicity in the human intestinal Caco-2 cells through mitochondrial damage, inhibition of the energy metabolism, and suppression of the ion and water molecule transporters, as well as the down-regulation of tight junction and cytoskeleton proteins.


Subject(s)
Adenocarcinoma , Aquaporins , Colonic Neoplasms , Diterpenes , Humans , Caco-2 Cells , Transforming Growth Factor beta1/metabolism , Molecular Docking Simulation , Adenocarcinoma/metabolism , Colonic Neoplasms/drug therapy , Colonic Neoplasms/metabolism , Diterpenes/pharmacology , Diterpenes/metabolism , Tight Junction Proteins/metabolism , Tight Junctions/metabolism , Energy Metabolism , Adenosine Triphosphate/metabolism , Aquaporins/metabolism , Adenosine Triphosphatases/metabolism , Intestinal Mucosa/metabolism , Permeability
11.
Methods Mol Biol ; 2624: 153-162, 2023.
Article in English | MEDLINE | ID: mdl-36723815

ABSTRACT

Pseudouridine (Ψ) is the first-discovered RNA modification abundantly present in many classes of RNAs, which plays a pivotal role in a series of biological processes. Accurately identifying the location of Ψ sites is helpful for relevant downstream research. In this chapter, we introduce the website PIANO, for pseudouridine site (Ψ) identification and functional annotation, which enables researchers to predict putative human Ψ sites with high accuracy (average AUC of 0.955 under the full transcript model and 0.838 under the mature mRNA model when testing on six independent datasets). The posttranscriptional regulatory mechanisms of putative Ψ sites including miRNA-targets, RBP-binding regions, and splicing sites were also annotated. A comprehensive query database was also provided to deposit over 4300 human Ψ modifications, which is currently the most complete collection of experimentally derived Ψ sites. The PIANO website is freely accessible at: http://piano.rnamd.com or http://180.208.58.19/Ψ-WHISTLE.


Subject(s)
MicroRNAs , Pseudouridine , Humans , Pseudouridine/genetics , RNA, Messenger/genetics , Sequence Analysis, RNA , RNA Splicing , RNA Processing, Post-Transcriptional
12.
Nucleic Acids Res ; 51(D1): D106-D116, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36382409

ABSTRACT

With advanced technologies to map RNA modifications, our understanding of them has been revolutionized, and they are seen to be far more widespread and important than previously thought. Current next-generation sequencing (NGS)-based modification profiling methods are blind to RNA modifications and thus require selective chemical treatment or antibody immunoprecipitation methods for particular modification types. They also face the problem of short read length, isoform ambiguities, biases and artifacts. Direct RNA sequencing (DRS) technologies, commercialized by Oxford Nanopore Technologies (ONT), enable the direct interrogation of any given modification present in individual transcripts and promise to address the limitations of previous NGS-based methods. Here, we present the first ONT-based database of quantitative RNA modification profiles, DirectRMDB, which includes 16 types of modification and a total of 904,712 modification sites in 25 species identified from 39 independent studies. In addition to standard functions adopted by existing databases, such as gene annotations and post-transcriptional association analysis, we provide a fresh view of RNA modifications, which enables exploration of the epitranscriptome in an isoform-specific manner. The DirectRMDB database is freely available at: http://www.rnamd.org/directRMDB/.


Subject(s)
High-Throughput Nucleotide Sequencing , RNA Processing, Post-Transcriptional , Sequence Analysis, RNA , High-Throughput Nucleotide Sequencing/methods , Molecular Sequence Annotation , Protein Isoforms , RNA/genetics , Sequence Analysis, RNA/methods , Databases, Nucleic Acid
13.
Nucleic Acids Res ; 51(D1): D1388-D1396, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36062570

ABSTRACT

Recent advances in epitranscriptomics have unveiled functional associations between RNA modifications (RMs) and multiple human diseases, but distinguishing the functional or disease-related single nucleotide variants (SNVs) from the majority of 'silent' variants remains a major challenge. We previously developed the RMDisease database for unveiling the association between genetic variants and RMs concerning human disease pathogenesis. In this work, we present RMDisease v2.0, an updated database with expanded coverage. Using deep learning models and from 873 819 experimentally validated RM sites, we identified a total of 1 366 252 RM-associated variants that may affect (add or remove an RM site) 16 different types of RNA modifications (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G, A-to-I, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C) in 20 organisms (human, mouse, rat, zebrafish, maize, fruit fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus monkey, tomato, chimpanzee, green monkey and SARS-CoV-2). Among them, 14 749 disease- and 2441 trait-associated genetic variants may function via the perturbation of epitranscriptomic markers. RMDisease v2.0 should serve as a useful resource for studying the genetic drivers of phenotypes that lie within the epitranscriptome layer circuitry, and is freely accessible at: www.rnamd.org/rmdisease2.


Subject(s)
Databases, Factual , RNA Processing, Post-Transcriptional , Animals , Humans , Phenotype , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , Epigenomics
14.
Cell Death Dis ; 13(12): 1059, 2022 12 20.
Article in English | MEDLINE | ID: mdl-36539410

ABSTRACT

Epigenetic factor Brd4 has emerged as a key regulator of cancer cell proliferation. Targeted inhibition of Brd4 suppresses growth and induces apoptosis of various cancer cells. In addition to apoptosis, Brd4 has also been shown to regulate several other forms of programmed cell death (PCD), including autophagy, necroptosis, pyroptosis, and ferroptosis, with different biological outcomes. PCD plays key roles in development and tissue homeostasis by eliminating unnecessary or detrimental cells. Dysregulation of PCD is associated with various human diseases, including cancer, neurodegenerative and infectious diseases. In this review, we discussed some recent findings on how Brd4 actively regulates different forms of PCD and the therapeutic potentials of targeting Brd4 in PCD-related human diseases. A better understanding of PCD regulation would provide not only new insights into pathophysiological functions of PCD but also provide new avenues for therapy by targeting Brd4-regulated PCD.


Subject(s)
Ferroptosis , Neoplasms , Humans , Nuclear Proteins/genetics , Nuclear Proteins/therapeutic use , Transcription Factors/therapeutic use , Apoptosis/physiology , Pyroptosis , Neoplasms/genetics , Neoplasms/drug therapy , Cell Cycle Proteins/genetics
15.
Article in English | MEDLINE | ID: mdl-36096444

ABSTRACT

As the most pervasive epigenetic marker present on mRNA and lncRNA, N6-methyladenosine (m6A) RNA methylation has been shown to participate in essential biological processes. Recent studies have revealed the distinct patterns of m6A methylome across human tissues, and a major challenge remains in elucidating the tissue-specific presence and circuitry of m6A methylation. We present here a comprehensive online platform m6A-TSHub for unveiling the context-specific m6A methylation and genetic mutations that potentially regulate m6A epigenetic mark. m6A-TSHub consists of four core components, including: (1) m6A-TSDB, a comprehensive database of 184,554 functionally annotated m6A sites derived from 23 human tissues and 499,369 m6A sites from 25 tumor conditions, respectively; (2) m6A-TSFinder, a web server for high-accuracy prediction of m6A methylation sites within a specific tissue from RNA sequences, which was constructed using multi-instance deep neural networks with gated attention; (3) m6A-TSVar, a web server for assessing the impact of genetic variants on tissue-specific m6A RNA modifications; and (4) m6A-CAVar, a database of 587,983 The Cancer Genome Atlas (TCGA) cancer mutations (derived from 27 cancer types) that were predicted to affect m6A modifications in the primary tissue of cancers. The database should make a useful resource for studying the m6A methylome and the genetic factors of epitranscriptome disturbance in a specific tissue (or cancer type). m6A-TSHub is accessible at www.xjtlu.edu.cn/biologicalsciences/m6ats.

16.
Nucleic Acids Res ; 50(18): 10290-10310, 2022 10 14.
Article in English | MEDLINE | ID: mdl-36155798

ABSTRACT

As the most pervasive epigenetic mark present on mRNA and lncRNA, N6-methyladenosine (m6A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years; nevertheless, their potential remains underexploited. One reason for this is that existing models usually consider only the sequence of transcripts, ignoring the various regions (or geography) of transcripts such as 3'UTR and intron, where the epigenetic mark forms and functions. Here, we developed three simple yet powerful encoding schemes for transcripts to capture the submolecular geographic information of RNA, which is largely independent from sequences. We show that m6A prediction models based on geographic information alone can achieve comparable performances to classic sequence-based methods. Importantly, geographic information substantially enhances the accuracy of sequence-based models, enables isoform- and tissue-specific prediction of m6A sites, and improves m6A signal detection from direct RNA sequencing data. The geographic encoding schemes we developed have exhibited strong interpretability, and are applicable to not only m6A but also N1-methyladenosine (m1A), and can serve as a general and effective complement to the widely used sequence encoding schemes in deep learning applications concerning RNA transcripts.


Subject(s)
Deep Learning , RNA, Long Noncoding , 3' Untranslated Regions , Methylation , Protein Isoforms/genetics , RNA/genetics , RNA/metabolism , RNA, Messenger/genetics
17.
Methods ; 203: 62-69, 2022 07.
Article in English | MEDLINE | ID: mdl-35429629

ABSTRACT

Traditional epitranscriptome profiling approaches rely on specific antibodies or chemical treatments to capture modified RNA molecules and then apply high-throughput sequencing to identify their transcriptomic locations. However, due to the lack of suitable or high-quality antibodies, only a small proportion of the 170 documented RNA modifications were profiled with those approaches. Direct sequencing of native RNA molecules using Oxford Nanopore Technologies (ONT) enabled direct inspection of RNA modifications and offered a promising alternative solution. N6-methyladenosine (m6A) is known to cause characteristic changes and increased base call errors of ONT signals compared with non-modified adenosines, based on which the m6A sites can be identified directly from ONT signals. Meanwhile, a number of studies have shown that it is possible to predict m6A sites from RNA primary sequences. Using the m6A sites revealed by Illumina technology as a benchmark, we showed that the accuracy of ONT-based m6A site prediction can be further increased by integrating additional information from the primary sequences of RNA (AUROC of 0.918), compared with using ONT signals only (AUROC 0.878 using base call error features, and 0.804 using Tombo features), providing a new perspective for more reliable mining of the relatively noisy ONT signals.


Subject(s)
Nanopores , RNA , Adenosine/genetics , High-Throughput Nucleotide Sequencing , Methylation , RNA/genetics , Sequence Analysis, RNA
18.
Nucleic Acids Res ; 50(D1): D196-D203, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34986603

ABSTRACT

5-Methylcytosine (m5C) is one of the most prevalent covalent modifications on RNA. It is known to regulate a broad variety of RNA functions, including nuclear export, RNA stability and translation. Here, we present m5C-Atlas, a database for comprehensive collection and annotation of RNA 5-methylcytosine. The database contains 166 540 m5C sites in 13 species identified from 5 base-resolution epitranscriptome profiling technologies. Moreover, condition-specific methylation levels are quantified from 351 RNA bisulfite sequencing samples gathered from 22 different studies via an integrative pipeline. The database also presents several novel features, such as the evolutionary conservation of a m5C locus, its association with SNPs, and any relevance to RNA secondary structure. All m5C-atlas data are accessible through a user-friendly interface, in which the m5C epitranscriptomes can be freely explored, shared, and annotated with putative post-transcriptional mechanisms (e.g. RBP intermolecular interaction with RNA, microRNA interaction and splicing sites). Together, these resources offer unprecedented opportunities for exploring m5C epitranscriptomes. The m5C-Atlas database is freely accessible at https://www.xjtlu.edu.cn/biologicalsciences/m5c-atlas.


Subject(s)
Databases, Genetic , Epigenome/genetics , Software , Transcriptome/genetics , 5-Methylcytosine/chemistry , 5-Methylcytosine/metabolism , Humans , MicroRNAs/genetics , Polymorphism, Single Nucleotide/genetics , RNA Processing, Post-Transcriptional/genetics , Sequence Analysis, RNA
19.
Methods ; 203: 328-334, 2022 07.
Article in English | MEDLINE | ID: mdl-33540081

ABSTRACT

N6,2'-O-dimethyladenosine (m6Am) is a reversible modification that occurs widely on various RNA molecules. The biological function of m6Am is yet to be fully understood, though recent studies have revealed its influence on cellular mRNA fate. Precise identification of m6Am sites on RNA is vital for the understanding of its biological functions. We present here m6AmPred, the first web server for in silico identification of m6Am sites from the primary sequences of RNA. Built upon the eXtreme Gradient Boosting with Dart algorithm (XgbDart) and EIIP-PseEIIP encoding scheme, m6AmPred achieved promising prediction performance with AUCs greater than 0.954 when tested by 10-fold cross-validation and independent testing datasets. To critically test and validate the performance of m6AmPred, the experimentally verified m6Am sites from two data sources were cross-validated. The m6AmPred web server is freely accessible at: https://www.xjtlu.edu.cn/biologicalsciences/m6am, and it should make a useful tool for researchers who are interested in N6,2'-O-dimethyladenosine RNA modification.


Subject(s)
Adenosine , RNA , Adenosine/genetics , RNA/genetics , RNA, Messenger/genetics
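
The EIIP-PseEIIP encoding named in the abstract above can be sketched as follows. This is a minimal illustration that assumes the electron-ion interaction pseudopotential values commonly quoted in the literature (A 0.1260, C 0.1340, G 0.0806, T/U 0.1335) and the usual PseEIIP definition (summed EIIP of each trinucleotide multiplied by its normalized frequency). Whether m6AmPred uses exactly these settings is an assumption, and the function names are hypothetical.

```python
from itertools import product
from typing import List

# Commonly quoted EIIP values; U is given the value usually listed for T.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "U": 0.1335}

# All 64 trinucleotides in a fixed order, so feature positions are stable.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=3)]


def eiip_encode(seq: str) -> List[float]:
    """Replace each nucleotide by its EIIP value (unknown bases become 0.0)."""
    rna = seq.upper().replace("T", "U")
    return [EIIP.get(base, 0.0) for base in rna]


def pse_eiip_encode(seq: str) -> List[float]:
    """64-dimensional vector: trinucleotide EIIP sum times its frequency."""
    rna = seq.upper().replace("T", "U")
    counts = {tri: 0 for tri in TRINUCLEOTIDES}
    total = 0
    for i in range(len(rna) - 2):
        tri = rna[i:i + 3]
        if tri in counts:  # skip windows containing ambiguous bases
            counts[tri] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [
        sum(EIIP[base] for base in tri) * counts[tri] / total
        for tri in TRINUCLEOTIDES
    ]


if __name__ == "__main__":
    window = "GGACUAACU"  # hypothetical sequence window around a candidate site
    features = eiip_encode(window) + pse_eiip_encode(window)
    print(len(features))
```

Concatenating the per-position EIIP values with the 64 PseEIIP values gives one fixed-length numeric vector per sequence window, which is the kind of input a gradient-boosting classifier expects.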
20.
Methods ; 203: 378-382, 2022 07.
Article in English | MEDLINE | ID: mdl-34245870

ABSTRACT

The primary sequences of DNA, RNA and protein have been used as the dominant information source of existing machine learning tools, especially for contexts not fully explored by wet-experimental approaches. Since molecular markers are profoundly orchestrated in living organisms, those markers that cannot be unambiguously recovered from the primary sequence often help to predict other biological events. To the best of our knowledge, there is no current tool to build and deploy machine learning models that consider genomic evidence. We therefore developed the WHISTLE server, the first machine learning platform based on genomic coordinates. It features convenient covariate extraction and model web deployment with 46 distinct genomic features integrated along with the conventional sequence features. We showed that, when predicting m6A sites from the SRAMP project, the model integrating genomic features substantially outperformed those based on only sequence features. The WHISTLE server should be a useful tool for studying biological attributes specifically associated with genomic coordinates, and is freely accessible at: www.xjtlu.edu.cn/biologicalsciences/whi2.


Subject(s)
Machine Learning , RNA , Computational Biology , Genomics , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA