Results 1 - 20 of 131
1.
Nucleic Acids Res ; 50(D1): D413-D420, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34570220

ABSTRACT

LncRNAs are well known as non-coding elements, but they can also serve as templates for peptide translation, playing important roles in fundamental cellular processes and diseases. Here, we describe a database, TransLnc (http://bio-bigdata.hrbmu.edu.cn/TransLnc/), which aims to provide comprehensive experimentally supported and predicted lncRNA peptides in multiple species. TransLnc currently documents approximately 583 840 peptides encoded by 33 094 lncRNAs. Six types of direct and indirect evidence supporting the coding potential of lncRNAs were integrated, and 65.28% of peptide entries are supported by at least one type of evidence. Considering the strong tissue-specific expression of lncRNAs, TransLnc allows users to access lncRNA peptides in any of the 34 tissues covered. In addition, the unique characteristics and homology relationships of the peptides were predicted and provided. Importantly, TransLnc provides computationally predicted tumour neoantigens from peptides encoded by lncRNAs, which could provide novel insights into cancer immunotherapy. There were 220 791 and 237 915 candidate neoantigens bound by major histocompatibility complex (MHC) class I and II molecules, respectively. Several flexible tools were developed to aid retrieval and analysis, particularly of lncRNA tissue expression patterns and clinical relevance across cancer types. TransLnc will serve as a valuable resource for investigating the translation capacity of lncRNAs and greatly extends the cancer immunopeptidome.


Subject(s)
Databases, Genetic , Neoplasms/genetics , Peptides/genetics , Protein Biosynthesis , RNA, Long Noncoding/genetics , Software , Animals , Antigens, Neoplasm/genetics , Antigens, Neoplasm/immunology , Binding Sites , Gene Expression Regulation, Neoplastic , Histocompatibility Antigens Class I/genetics , Histocompatibility Antigens Class I/immunology , Histocompatibility Antigens Class II/genetics , Histocompatibility Antigens Class II/immunology , Humans , Immunotherapy/methods , Internet , Mice , Molecular Sequence Annotation , Neoplasm Proteins/classification , Neoplasm Proteins/genetics , Neoplasm Proteins/immunology , Neoplasms/immunology , Neoplasms/pathology , Neoplasms/therapy , Organ Specificity , Peptides/classification , Peptides/immunology , Protein Binding , RNA, Long Noncoding/classification , RNA, Long Noncoding/immunology , Rats
2.
Nucleic Acids Res ; 50(D1): D211-D221, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34570238

ABSTRACT

Small non-coding RNAs (sncRNAs) are pervasive regulators of physiological and pathological processes. We previously developed the human miRNA Tissue Atlas, detailing the expression of miRNAs across organs in the human body. Here, we present an updated resource containing sequencing data of 188 tissue samples comprising 21 organ types retrieved from six humans. Sampling the organs from the same bodies minimizes inter-individual variability and facilitates the making of a precise high-resolution body map of the non-coding transcriptome. The data shed light on the organ- and organ system-specificity of piwi-interacting RNAs (piRNAs), transfer RNAs (tRNAs), microRNAs (miRNAs) and other non-coding RNAs. As a use case of our resource, we describe the identification of highly specific ncRNAs in different organs. The update also contains 58 samples from six tissues of the Tabula Muris collection, allowing users to check whether tissue specificity is evolutionarily conserved between Homo sapiens and Mus musculus. The updated resource of 87 252 non-coding RNAs from nine non-coding RNA classes for all organs and organ systems is available online without any restrictions (https://www.ccb.uni-saarland.de/tissueatlas2).


Subject(s)
MicroRNAs/genetics , RNA, Long Noncoding/genetics , RNA, Small Interfering/genetics , RNA, Small Nuclear/genetics , RNA, Small Nucleolar/genetics , RNA, Transfer/genetics , Software , Animals , Atlases as Topic , Female , Humans , Internet , Male , Mice , MicroRNAs/classification , MicroRNAs/metabolism , Organ Specificity , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , RNA, Small Interfering/classification , RNA, Small Interfering/metabolism , RNA, Small Nuclear/classification , RNA, Small Nuclear/metabolism , RNA, Small Nucleolar/classification , RNA, Small Nucleolar/metabolism , RNA, Transfer/classification , RNA, Transfer/metabolism , Transcriptome
3.
Nucleic Acids Res ; 50(D1): D190-D195, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34751395

ABSTRACT

LncRNAWiki, a knowledgebase of human long non-coding RNAs (lncRNAs), has been rapidly expanded by incorporating more experimentally validated lncRNAs. Because it was built on MediaWiki as its database system, it fails to manage data in a structured way and is ineffective in supporting systematic exploration of lncRNAs. Here we present LncRNAWiki 2.0 (https://ngdc.cncb.ac.cn/lncrnawiki), which is significantly improved, with an enhanced database system and curation model. In LncRNAWiki 2.0, all contents are organized in a structured manner powered by MySQL/Java, and curators are able to submit/edit annotations based on a curation model that includes a wider range of annotation items. Moreover, it is equipped with popular online tools to help users identify lncRNAs with potentially important functions, and provides more user-friendly web interfaces to facilitate data curation, retrieval and visualization. Consequently, LncRNAWiki 2.0 incorporates a total of 2512 lncRNAs and 106 242 associations for disease, function, drug, interacting partner, molecular signature, experimental sample, CRISPR design, etc., thus providing a comprehensive and up-to-date resource of functionally annotated lncRNAs in human.


Subject(s)
Databases, Genetic , Knowledge Bases , RNA, Long Noncoding/genetics , Software , Humans , Internet , Molecular Sequence Annotation , RNA, Long Noncoding/classification
4.
Nucleic Acids Res ; 50(D1): D1442-D1447, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34723326

ABSTRACT

The Green Non-Coding Database (GreeNC) is one of the reference databases for the study of plant long non-coding RNAs (lncRNAs). Here we present our most recent update, in which 16 species have been updated and 78 species have been added, resulting in the annotation of more than 495 000 lncRNAs. Moreover, sequence clustering was applied, providing information about sequence conservation and gene families. The current version of the database is available at: http://greenc.sequentiabiotech.com/wiki2/Main_Page.


Subject(s)
Databases, Nucleic Acid , Genome, Plant/genetics , Plants/classification , RNA, Long Noncoding/classification , Conserved Sequence/genetics , Humans , Molecular Sequence Annotation , Plants/genetics , RNA, Long Noncoding/genetics , RNA, Plant/classification , RNA, Plant/genetics
5.
Nucleic Acids Res ; 50(D1): D1295-D1306, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34791419

ABSTRACT

Long non-coding RNAs (lncRNAs) that associate with other molecules can coordinate several physiological processes, and their dysfunction can impact diverse human diseases. To date, systematic and intensive annotations of the diverse interaction regulations of lncRNAs in human cancer have not been available. Here, we built lncRNAfunc, a knowledgebase of lncRNA function in human cancer at https://ccsm.uth.edu/lncRNAfunc, aiming to provide a resource and reference for identifying therapeutically targetable lncRNAs and intensive interaction regulations. To do this, we collected 15 900 lncRNAs across 33 cancer types from TCGA. For individual lncRNAs, we performed multiple interaction analyses with different biomolecules at the DNA, RNA, and protein levels. Our intensive studies of lncRNAs suggest diverse potential mechanisms by which lncRNAs regulate gene expression: binding enhancers and 3'-UTRs of genes, competing with mRNAs for miRNA binding sites, and recruiting transcription factors to gene promoters. Furthermore, we investigated lncRNAs that potentially affect alternative splicing events by interacting with RNA-binding proteins. We also performed multiple functional annotations, including cancer stage-associated lncRNAs, RNA A-to-I editing event-associated lncRNAs, and lncRNA expression quantitative trait loci. lncRNAfunc is a unique resource for cancer research communities to help better understand potential lncRNA regulations and therapeutic lncRNA targets.


Subject(s)
Databases, Genetic , Knowledge Bases , Neoplasms/genetics , RNA, Long Noncoding/genetics , Alternative Splicing/genetics , Humans , Neoplasms/classification , RNA, Long Noncoding/classification , RNA, Messenger/genetics
6.
Nucleic Acids Res ; 50(D1): D118-D128, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34918744

ABSTRACT

Extracellular vesicles (EVs) are small membranous vesicles that contain an abundant cargo of different RNA species with specialized functions and clinical implications. Here, we introduce an updated online database (http://www.exoRBase.org), exoRBase 2.0, which is a repository of EV long RNAs (termed exLRs) derived from RNA-seq data analyses of diverse human body fluids. In exoRBase 2.0, the number of exLRs has increased to 19 643 messenger RNAs (mRNAs), 15 645 long non-coding RNAs (lncRNAs) and 79 084 circular RNAs (circRNAs) obtained from ∼1000 human blood, urine, cerebrospinal fluid (CSF) and bile samples. Importantly, exoRBase 2.0 not only integrates and compares exLR expression profiles but also visualizes the pathway-level functional changes and the heterogeneity of origins of circulating EVs in the context of different physiological and pathological conditions. Our database provides an attractive platform for the identification of novel exLR signatures from human biofluids that will aid in the discovery of new circulating biomarkers to improve disease diagnosis and therapy.


Subject(s)
Databases, Genetic , RNA, Circular/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Body Fluids/chemistry , Extracellular Vesicles/classification , Extracellular Vesicles/genetics , Humans , RNA, Circular/classification , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/classification , RNA, Messenger/chemistry , RNA, Messenger/classification , RNA-Seq
7.
Infect Genet Evol ; 97: 105195, 2022 01.
Article in English | MEDLINE | ID: mdl-34954105

ABSTRACT

SARS-CoV-2 is the RNA virus responsible for COVID-19, the prognosis of which has been found to be slightly worse in men. The present study aimed to analyze the expression of different mRNAs and their regulatory molecules (miRNAs and lncRNAs) to consider the potential existence of sex-specific expression patterns and COVID-19 susceptibility using bioinformatics analysis. The binding sites of all human mature miRNA sequences on the SARS-CoV-2 genome nucleotide sequence were predicted by the miRanda tool. Sequencing data were retrieved from GSE157103 using the Galaxy web server, and the output of featureCounts was analyzed using the DESeq2 package to obtain differentially expressed genes (DEGs). Gene set enrichment analysis (GSEA) and DEG annotation analyses were performed using the ToppGene and Metascape tools. Using the RNA Interactome Database, we predicted interactions between differentially expressed lncRNAs and differentially expressed mRNAs. Finally, their networks were constructed with top miRNAs. We identified 11 miRNAs with three to five binding sites on the SARS-CoV-2 reference genome. MiR-29c-3p, miR-21-3p, and miR-6838-5p occupied four binding sites, and miR-29a-3p had five binding sites on the SARS-CoV-2 genome. Moreover, miR-29a-3p and miR-29c-3p were the top miRNAs targeting DEGs. The expression levels of miRNAs (125, 181b, 130a, 29a, b, c, 212, 181a, 133a) changed in males with COVID-19, in whom they regulated ACE2 expression and affected the immune response by affecting phagosomes, complement activation, and cell-matrix adhesion. Our results indicated that XIST lncRNA was up-regulated, and TTTY14, TTTY10, and ZFY-AS1 lncRNAs were down-regulated in both ICU and non-ICU men with COVID-19. Dysregulation of noncoding RNAs has critical effects on the pathophysiology of men with COVID-19, which is why they may be used as biomarkers and therapeutic agents. Overall, our results indicated that the miR-29 family target regulation patterns might become promising biomarkers for severity and survival outcome in men with COVID-19.


Subject(s)
Angiotensin-Converting Enzyme 2/genetics , COVID-19/genetics , MicroRNAs/genetics , RNA, Long Noncoding/genetics , SARS-CoV-2/genetics , Angiotensin-Converting Enzyme 2/metabolism , COVID-19/epidemiology , COVID-19/pathology , COVID-19/virology , Computational Biology/methods , Coronavirus Envelope Proteins/genetics , Coronavirus Envelope Proteins/metabolism , Coronavirus M Proteins/genetics , Coronavirus M Proteins/metabolism , Coronavirus Nucleocapsid Proteins/genetics , Coronavirus Nucleocapsid Proteins/metabolism , Databases, Genetic , Female , Gene Expression Regulation , Host-Pathogen Interactions/genetics , Humans , Male , MicroRNAs/classification , MicroRNAs/metabolism , Phosphoproteins/genetics , Phosphoproteins/metabolism , Protein Binding , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , SARS-CoV-2/classification , SARS-CoV-2/pathogenicity , Severity of Illness Index , Sex Factors , Signal Transduction , Spike Glycoprotein, Coronavirus/genetics , Spike Glycoprotein, Coronavirus/metabolism
8.
Front Immunol ; 12: 763323, 2021.
Article in English | MEDLINE | ID: mdl-34868009

ABSTRACT

Long non-coding RNAs (lncRNAs) have recently been reported to be involved in the pathoetiology of Parkinson's disease (PD). Circulatory levels of lncRNAs might be used as markers for PD. In the present work, we measured expression levels of HULC, PVT1, MEG3, SPRY4-IT1, LINC-ROR and DSCAM-AS1 lncRNAs in the circulation of patients with PD versus healthy controls. Expression of HULC was lower in total patients compared with total controls (Expression ratio (ER)=0.19, adjusted P value<0.0001) as well as in female patients compared with female controls (ER=0.071, adjusted P value=0.0004). Expression of PVT1 was lower in total patients compared with total controls (ER=0.55, adjusted P value=0.0124). Expression of DSCAM-AS1 was higher in total patients compared with total controls (ER=5.67, P value=0.0029) and in male patients compared with male controls (ER=9.526, adjusted P value=0.0024). Expression of SPRY4-IT1 was higher in total patients compared with total controls (ER=2.64, adjusted P value<0.02) and in male patients compared with male controls (ER=3.43, P value<0.03). Expression of LINC-ROR was higher in total patients compared with total controls (ER=10.36, adjusted P value<0.0001) and in both male and female patients compared with sex-matched controls (ER=4.57, adjusted P value=0.03 and ER=23.47, adjusted P value=0.0019, respectively). Finally, expression of MEG3 was higher in total patients compared with total controls (ER=13.94, adjusted P value<0.0001) and in both male and female patients compared with sex-matched controls (ER=8.60, adjusted P value<0.004 and ER=22.58, adjusted P value<0.0085, respectively). ROC curve analysis revealed that MEG3 and LINC-ROR have diagnostic power (AUC) of 0.77 and 0.73, respectively. The other lncRNAs had AUC values less than 0.7. Expression of none of the lncRNAs was correlated with age of patients, disease duration, disease stage, MMSE or UPDRS. The current study provides further evidence for dysregulation of lncRNAs in the circulation of PD patients.
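
The AUC values quoted in this abstract can be read as the probability that a randomly chosen patient has a higher marker level than a randomly chosen control. The following is a minimal sketch of that rank-based calculation; the function name and the toy expression values are illustrative assumptions, not data or code from the study.

```python
def roc_auc(patient_levels, control_levels):
    """Rank-based AUC: fraction of (patient, control) pairs in which the
    patient value is higher, counting ties as half a win."""
    wins = 0.0
    for p in patient_levels:
        for c in control_levels:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_levels) * len(control_levels))


# Hypothetical relative expression values (not from the paper).
patients = [3.0, 5.0, 1.0, 4.0]
controls = [2.0, 1.5, 3.5, 0.5]
auc = roc_auc(patients, controls)
print(auc)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.77 and 0.73 sit.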


Subject(s)
Biomarkers, Tumor/genetics , Gene Expression Regulation, Neoplastic , Parkinson Disease/genetics , RNA, Long Noncoding/genetics , Transcriptome/genetics , Adult , Aged , Aged, 80 and over , Biomarkers, Tumor/blood , Cluster Analysis , Female , Humans , Male , Middle Aged , Parkinson Disease/blood , Parkinson Disease/diagnosis , RNA, Long Noncoding/blood , RNA, Long Noncoding/classification , ROC Curve
9.
Genes (Basel) ; 12(12)2021 12 19.
Article in English | MEDLINE | ID: mdl-34946967

ABSTRACT

Circular RNA (circRNA) is a distinct, circular form of long non-coding RNA (lncRNA) that has specific roles in transcriptional regulation and multiple biological processes. Distinguishing circRNA from other lncRNA is necessary for relevant research. In this study, we designed an attention-based multi-instance learning (MIL) network architecture fed with a raw sequence, to learn the sparse features of RNA sequences and to accomplish the circRNA identification task. The model outperformed state-of-the-art models. Moreover, after the effectiveness of the attention mechanism was validated on a handwritten digit dataset, the key sequence loci underlying circRNA recognition were obtained based on the corresponding attention scores. Then, motif enrichment analysis identified some of the key motifs for circRNA formation. In conclusion, we designed a deep learning network architecture suitable for learning gene sequences with sparse features and implemented it for the circRNA identification task, and the model has strong representation capability in indicating some key loci.
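
To illustrate how attention scores in a multi-instance model can point at key loci, here is a minimal sketch of one common attention-pooling formulation, in which each instance embedding h is scored as w · tanh(V h) and the scores are passed through a softmax. The abstract does not state which formulation the authors used, so the function, the parameter names `V` and `w`, and the toy values are all assumptions.

```python
import math


def attention_pool(instances, V, w):
    """Pool one bag of instance vectors into a single bag vector.

    Returns (bag_vector, attention_weights). V and w stand in for learned
    parameters; here they are fixed by hand for illustration.
    """
    # Score each instance: w . tanh(V h)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    # Softmax over instances, shifted by the max for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag vector is the attention-weighted sum of instance vectors
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Three hypothetical sequence-window embeddings; the second is the "signal".
instances = [[0.1, 0.0], [2.0, 1.5], [0.0, 0.2]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
bag, weights = attention_pool(instances, V, w)
print(weights)
```

The instance with the largest weight is the one the model attends to most, which is the sense in which attention scores can be mapped back to sequence positions.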


Subject(s)
Computational Biology/methods , RNA, Circular/classification , RNA, Long Noncoding/classification , Databases, Genetic , Deep Learning , Gene Expression Regulation
10.
Int J Mol Sci ; 22(22)2021 Nov 16.
Article in English | MEDLINE | ID: mdl-34830241

ABSTRACT

Breast cancer (BC) is the most frequent malignancy identified in adult females, resulting in enormous financial losses worldwide. Owing to its heterogeneity and various molecular subtypes, the molecular pathways underlying carcinogenesis in various forms of BC are distinct. Therefore, the advancement of alternative therapy is required to combat the ailment. Recent analyses propose that long non-coding RNAs (lncRNAs) perform an essential function in controlling immune response, and therefore may provide essential information about the disorder. However, their function in patients with triple-negative BC (TNBC) has not been explored in detail. Here, we analyzed the changes in the genomic expression of messenger RNA (mRNA) and lncRNA in standard control in response to cancer metastasis using publicly available single-cell RNA-Seq data. We identified a total of 197 potentially novel lncRNAs in TNBC patients, of which 86 were differentially upregulated and 111 were differentially downregulated. In addition, among the 909 candidate lncRNA transcripts, 19 were significantly differentially expressed (DE), of which three were upregulated and 16 were downregulated. On the other hand, 1901 mRNA transcripts were significantly DE, of which 1110 were upregulated and 791 were downregulated in TNBC subtypes. The Gene Ontology (GO) analyses showed that some of the host genes were enriched in various biological, molecular, and cellular functions. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that some of the genes were involved in only one pathway of prostate cancer. The lncRNA-miRNA-gene network analysis showed that the lncRNAs TCONS_00076394 and TCONS_00051377 interacted with breast cancer-related microRNAs (miRNAs) and that the host genes of these lncRNAs were also functionally related to breast cancer. Thus, this study provides novel lncRNAs as potential biomarkers for the therapeutic intervention of this cancer subtype.


Subject(s)
MicroRNAs/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , RNA, Neoplasm/genetics , Triple Negative Breast Neoplasms/genetics , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Computational Biology/methods , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Gene Ontology , Gene Regulatory Networks , Humans , Mammary Glands, Human/metabolism , Mammary Glands, Human/pathology , MicroRNAs/classification , MicroRNAs/metabolism , Molecular Sequence Annotation , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , RNA, Messenger/classification , RNA, Messenger/metabolism , RNA, Neoplasm/classification , RNA, Neoplasm/metabolism , Triple Negative Breast Neoplasms/diagnosis , Triple Negative Breast Neoplasms/metabolism , Triple Negative Breast Neoplasms/pathology
11.
PLoS One ; 16(10): e0258194, 2021.
Article in English | MEDLINE | ID: mdl-34597331

ABSTRACT

To identify long non-coding RNAs (lncRNAs) and their potential roles in hepatic fibrosis in rat liver tissues induced by CCl4, lncRNAs and genes were analyzed in fibrotic rat liver tissues by RNA sequencing and verified by quantitative reverse transcription polymerase chain reaction (qRT-PCR). Differentially expressed (DE) lncRNAs (DE-lncRNAs) and genes were subjected to bioinformatics analysis and used to construct a co-expression network. We identified 10 novel DE-lncRNAs that were downregulated during the hepatic fibrosis process. The cis target gene of the DE-lncRNA XLOC118358 was Met, and the cis target gene of the other nine DE-lncRNAs (XLOC004600, XLOC004605, XLOC004610, XLOC004611, XLOC004568, XLOC004580, XLOC004598, XLOC004601, and XLOC004602) was Nox4. The pathway-DEG co-expression network showed that lncRNA-Met and lncRNAs-Nox4 were involved in oxidation-reduction processes and the PI3K/Akt signaling pathway. Our results identified 10 DE-lncRNAs related to hepatic fibrosis, and the potential roles of DE-lncRNAs and target genes in hepatic fibrosis might provide new therapeutic strategies for hepatic fibrosis.


Subject(s)
Genetic Diseases, Inborn/genetics , Liver Cirrhosis/genetics , Liver/metabolism , RNA, Long Noncoding/genetics , Transcriptome/genetics , Animals , Carbon Tetrachloride/toxicity , Gene Regulatory Networks/genetics , Genetic Diseases, Inborn/chemically induced , Genetic Diseases, Inborn/pathology , High-Throughput Nucleotide Sequencing , Humans , Liver Cirrhosis/chemically induced , Liver Cirrhosis/pathology , RNA, Long Noncoding/classification , RNA, Long Noncoding/isolation & purification , Rats , Sequence Analysis, RNA , Signal Transduction/genetics
12.
Biomolecules ; 11(8)2021 07 31.
Article in English | MEDLINE | ID: mdl-34439798

ABSTRACT

Neurodegenerative diseases (NDs) are characterized by progressive neuronal dysfunction and death of brain cell populations. As the early manifestations of NDs are similar, their symptoms are difficult to distinguish, making the timely detection and discrimination of each neurodegenerative disorder a priority. Several investigations have revealed the importance of microRNAs and long non-coding RNAs in neurodevelopment, brain function, maturation, and neuronal activity, as well as the involvement of their dysregulation in many types of neurological diseases. Therefore, the expression patterns of these molecules in the different NDs have gained significant attention as a way to improve diagnosis and treatment at earlier stages. In this sense, we gather the different microRNAs and long non-coding RNAs that have been reported as dysregulated in each disorder. Since a vast number of non-coding RNAs are altered in NDs, some method of synthesis, filtering and organization should be applied to extract the most relevant information. Hence, machine learning is considered an important tool for this purpose, since it can classify expression profiles of non-coding RNAs between healthy and sick people. Therefore, we examine this branch of computer science in depth, along with its different methods and its meaningful application in the diagnosis of NDs from dysregulated non-coding RNAs. In addition, we demonstrate the relevance of machine learning in NDs by describing different investigations that showed an accuracy between 85% and 95% in the detection of the disease with this tool. All of these indicate that artificial intelligence could be an excellent alternative to support clinical diagnosis and facilitate the identification of diseases in early stages based on non-coding RNAs.


Subject(s)
Alzheimer Disease/genetics , Amyotrophic Lateral Sclerosis/genetics , Machine Learning , MicroRNAs/genetics , Parkinson Disease/genetics , RNA, Long Noncoding/genetics , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Amyotrophic Lateral Sclerosis/metabolism , Amyotrophic Lateral Sclerosis/pathology , Computational Biology/methods , Databases, Genetic , Gene Expression Regulation , Humans , Information Dissemination , Internet , MicroRNAs/classification , MicroRNAs/metabolism , Nerve Tissue Proteins/genetics , Nerve Tissue Proteins/metabolism , Neurons/metabolism , Neurons/pathology , Parkinson Disease/metabolism , Parkinson Disease/pathology , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , Signal Transduction , Software
13.
Cells ; 10(7)2021 07 02.
Article in English | MEDLINE | ID: mdl-34359842

ABSTRACT

Noncoding RNAs (ncRNAs), including microRNAs (miRNAs), small interfering RNAs (siRNAs), circular RNAs (circRNAs), and long noncoding RNAs (lncRNAs), control gene expression at the transcription, post-transcription, and translation levels. Apart from protein-coding genes, accumulating evidence supports ncRNAs playing a critical role in shaping plant growth and development and biotic and abiotic stress responses in various species, including legume crops. ncRNAs interact with DNA, RNA, and proteins, modulating their target genes. However, the regulatory mechanisms controlling these cellular processes are not well understood. Here, we discuss the features of various ncRNAs, including their emerging role in contributing to biotic/abiotic stress response and plant growth and development, in addition to the molecular mechanisms involved, focusing on legume crops. Unravelling the underlying molecular mechanisms and functional implications of ncRNAs will enhance our understanding of the coordinated regulation of plant defences against various biotic and abiotic stresses and of key growth and development processes, to better design various legume crops for global food security.


Subject(s)
Fabaceae/genetics , Gene Expression Regulation, Plant , MicroRNAs/genetics , RNA, Circular/genetics , RNA, Long Noncoding/genetics , RNA, Plant/genetics , RNA, Small Interfering/genetics , Fabaceae/growth & development , Fabaceae/metabolism , Food Security , Gene Expression Regulation, Developmental , Humans , MicroRNAs/classification , MicroRNAs/metabolism , Organ Specificity , Protein Biosynthesis , RNA, Circular/classification , RNA, Circular/metabolism , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , RNA, Plant/classification , RNA, Plant/metabolism , RNA, Small Interfering/classification , RNA, Small Interfering/metabolism , Species Specificity , Stress, Physiological/genetics , Transcription, Genetic
14.
Sci Rep ; 11(1): 16794, 2021 08 18.
Article in English | MEDLINE | ID: mdl-34408216

ABSTRACT

Lung adenocarcinoma (LUAD) is the most common subtype of lung cancer, but the prognosis of LUAD patients remains unsatisfactory. Here, we retrieved the RNA-seq data of the LUAD cohort from The Cancer Genome Atlas (TCGA) database and then identified differentially expressed immune-related lncRNAs (DEirlncRNAs) between LUAD and normal controls. Based on a new method of cyclically single pairing along with a 0-or-1 matrix, we constructed a novel prognostic signature of 8 DEirlncRNA pairs in LUAD with no dependence upon specific expression levels of lncRNAs. This prognostic model exhibited significant power in distinguishing good from poor prognosis of LUAD patients, and the values of the area under the curve (AUC) were all over 0.70 in the 1-, 3- and 5-year receiver operating characteristic (ROC) curves. Moreover, the risk score of the model could serve as an independent prognostic factor for patients with LUAD. In addition, the risk model was significantly associated with clinicopathological characteristics, tumor-infiltrating immune cells, immune-related molecules and sensitivity to anti-tumor drugs. This novel signature of DEirlncRNA pairs in LUAD, which did not require specific expression levels of lncRNAs, might be used to guide the management of patients with LUAD in clinical practice.
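
The idea of a 0-or-1 pair matrix that does not depend on absolute expression levels can be sketched as follows: within each sample, every lncRNA pair is scored 1 if the first member is expressed more highly than the second and 0 otherwise. This is an illustrative reading of the abstract; the direction of the comparison, the function name and the toy values are assumptions rather than details confirmed by the source.

```python
from itertools import combinations


def pair_matrix(expression):
    """Map {sample: {lncRNA: level}} to {sample: {"A|B": 0 or 1}}.

    A pair "A|B" is scored 1 when A is expressed above B within that sample
    (assumed direction), so only the within-sample ordering matters.
    """
    genes = sorted(next(iter(expression.values())))
    result = {}
    for sample, levels in expression.items():
        result[sample] = {
            f"{a}|{b}": 1 if levels[a] > levels[b] else 0
            for a, b in combinations(genes, 2)
        }
    return result


# Hypothetical samples: S2 is S1 measured on a 10x larger scale.
toy = {
    "S1": {"lncA": 5.0, "lncB": 2.0, "lncC": 7.0},
    "S2": {"lncA": 50.0, "lncB": 20.0, "lncC": 70.0},
}
matrix = pair_matrix(toy)
print(matrix["S1"])
```

Because both toy samples share the same ordering of lncRNAs, they receive identical rows despite the tenfold difference in scale, which is the property that makes such a signature independent of specific expression levels.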


Subject(s)
Adenocarcinoma of Lung/genetics , Biomarkers, Tumor/genetics , RNA, Long Noncoding/genetics , Transcriptome/genetics , Aged , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic/genetics , Humans , Kaplan-Meier Estimate , Male , Middle Aged , Prognosis , RNA, Long Noncoding/classification , RNA-Seq
15.
RNA ; 27(9): 1082-1101, 2021 09.
Article in English | MEDLINE | ID: mdl-34193551

ABSTRACT

The expression of long noncoding RNAs is highly enriched in the human nervous system. However, the function of neuronal lncRNAs in the cytoplasm and their potential translation remain poorly understood. Here we performed Poly-Ribo-Seq to understand the interaction of lncRNAs with the translation machinery and the functional consequences during neuronal differentiation of human SH-SY5Y cells. We discovered 237 cytoplasmic lncRNAs up-regulated during early neuronal differentiation, 58%-70% of which are associated with polysome translation complexes. Among these polysome-associated lncRNAs, we find 45 small ORFs to be actively translated, 17 specifically upon differentiation. Fifteen of the 45 translated lncRNA-smORFs exhibit sequence conservation within Hominidae, suggesting they are under strong selective constraint in this clade. The profiling of publicly available data sets revealed that 8/45 of the translated lncRNAs are dynamically expressed during human brain development, and 22/45 are associated with cancers of the central nervous system. One translated lncRNA we discovered is LINC01116, which is induced upon differentiation and contains an 87-codon smORF exhibiting increased ribosome profiling signal upon differentiation. The resulting LINC01116 peptide localizes to neurites. Knockdown of LINC01116 results in a significant reduction of neurite length in differentiated cells, indicating it contributes to neuronal differentiation. Our findings indicate cytoplasmic lncRNAs interact with translation complexes, are a noncanonical source of novel peptides, and contribute to neuronal function and disease. Specifically, we demonstrate a novel functional role for LINC01116 during human neuronal differentiation.


Subject(s)
Cell Differentiation/genetics , Neurons/metabolism , Polyribosomes/genetics , Protein Biosynthesis , RNA, Long Noncoding/genetics , Base Sequence , Brain/growth & development , Brain/metabolism , Brain/pathology , Brain Neoplasms/genetics , Brain Neoplasms/metabolism , Brain Neoplasms/pathology , Cell Differentiation/drug effects , Cell Line, Tumor , Cytoplasm/genetics , Cytoplasm/metabolism , Humans , Neurons/cytology , Open Reading Frames , Polyribosomes/metabolism , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , Sequence Analysis, RNA , Tretinoin/pharmacology
16.
Funct Integr Genomics ; 21(2): 195-204, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33635499

ABSTRACT

Following the elucidation of the critical roles they play in numerous important biological processes, long noncoding RNAs (lncRNAs) have gained vast attention in recent years. Manual annotation of lncRNAs is restricted by known gene annotations and is prone to false prediction due to the incompleteness of available data. However, with the advent of high-throughput sequencing technologies, a large volume of high-quality data has become available for annotation, especially for plant species such as wheat. Here, we compared the prediction accuracies of several machine learning algorithms using 10-fold cross-validation. This study includes a comprehensive feature selection step to remove irrelevant and redundant features. We present a crop-specific, alignment-free coding potential prediction tool, LncMachine, that achieves higher prediction accuracies than the currently available popular tools (CPC2, CPAT, and CNIT) when used with the Random Forest algorithm. Further, LncMachine with Random Forest performed well on human and mouse data, with an average accuracy of 92.67%. LncMachine requires only a FASTA file or a TAB-separated CSV file containing features as input. LncMachine can deploy several user-provided algorithms in real time and can therefore be effortlessly applied to a wide range of studies.
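
The 10-fold cross-validation mentioned in this abstract can be sketched in a few lines: the data are shuffled into ten folds, each fold is held out once, and the accuracies are averaged. This is a generic illustration and not LncMachine's code; the threshold classifier, the single made-up feature and all names below are assumptions.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds (assumes n >= k)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Mean held-out accuracy over k folds for user-supplied fit/predict."""
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


# Hypothetical one-feature classifier: threshold halfway between class means.
def fit(xs, ys):
    mean0 = sum(x[0] for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x[0] for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2


def predict(threshold, x):
    return 1 if x[0] > threshold else 0


# Toy data: label 0 = noncoding-like, label 1 = coding-like (made-up feature).
features = [[float(i)] for i in range(10)] + [[100.0 + i] for i in range(10)]
labels = [0] * 10 + [1] * 10
mean_accuracy = cross_validate(features, labels, fit, predict, k=10)
print(mean_accuracy)
```

Any classifier exposing the same `fit`/`predict` pair could be swapped in, which is how several algorithms can be compared on identical folds.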


Subject(s)
Computational Biology , Molecular Sequence Annotation , Plants/genetics , RNA, Long Noncoding/genetics , Algorithms , High-Throughput Nucleotide Sequencing , Machine Learning , RNA, Long Noncoding/classification
17.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33585910

ABSTRACT

As a consequence of the various genomic sequencing projects, an increasing volume of biological sequence data is being produced. Although machine learning algorithms have been successfully applied to a large number of genomic sequence-related problems, the results are largely affected by the type and number of features extracted. This effect has motivated new algorithms and pipeline proposals, mainly involving feature extraction problems, in which extracting significant discriminatory information from a biological set is challenging. Considering this, our work proposes a new study of feature extraction approaches based on mathematical features (numerical mapping with Fourier, entropy and complex networks). As a case study, we analyze long non-coding RNA sequences. We separated this work into three studies. First, we assessed our proposal on the most addressed problem in our review, i.e. lncRNA vs. mRNA; second, we validated the mathematical features in different classification problems, to predict the class of lncRNA, e.g. circular RNA sequences; third, we analyzed its robustness in scenarios with imbalanced data. The experimental results demonstrated three main contributions: first, an in-depth study of several mathematical features; second, a new feature extraction pipeline; and third, its high performance and robustness for distinct RNA sequence classification. Availability: https://github.com/Bonidia/FeatureExtraction_BiologicalSequences.
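
Two of the feature families named in the abstract, numerical mapping followed by a Fourier transform and entropy, can be sketched as follows. This is an illustrative sketch and not the code in the linked repository: the integer values assigned to each base are an arbitrary choice made here, and the function names are hypothetical.

```python
import cmath
import math
from collections import Counter
from typing import List

# Arbitrary integer mapping chosen for illustration (an assumption, not the paper's).
BASE_TO_NUMBER = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0, "U": 4.0}


def power_spectrum(seq: str) -> List[float]:
    """Map bases to numbers, then return |DFT|^2 at each frequency index."""
    signal = [BASE_TO_NUMBER[base] for base in seq.upper()]
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def kmer_entropy(seq: str, k: int = 2) -> float:
    """Shannon entropy (in bits) of the k-mer frequency distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Summary statistics of the spectrum (for example its mean, maximum or variance) and entropies for several values of `k` could then be collected into a fixed-length feature vector for a classifier.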


Subject(s)
Computational Biology/methods , Deep Learning , Models, Theoretical , RNA, Circular/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Base Sequence/genetics , Entropy , Fourier Analysis , Humans , Open Reading Frames , RNA, Circular/classification , RNA, Long Noncoding/classification , RNA, Messenger/classification
18.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33415333

ABSTRACT

Predicting disease-related long non-coding RNAs (lncRNAs) is beneficial for finding new biomarkers for the prevention, diagnosis and treatment of complex human diseases. In this paper, we propose a machine learning-based classification approach that identifies disease-related lncRNAs using a graph auto-encoder (GAE) and random forest (RF), termed GAERF. First, we combined the relationships among lncRNAs, miRNAs and diseases into a heterogeneous network. Then, low-dimensional representation vectors of nodes were learned from the network by the GAE, which reduces the dimension and heterogeneity of the biological data. Taking these feature vectors as input, we trained an RF classifier to predict new lncRNA-disease associations (LDAs). Experimental results show that the proposed representation characterizes lncRNA-disease pairs accurately. GAERF achieves superior performance owing to the ensemble learning method, outperforming other methods significantly. Moreover, case studies further demonstrated that GAERF is an effective method to predict LDAs.
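
The first step described above, combining lncRNA, miRNA and disease relationships into one heterogeneous network, can be sketched as a symmetric adjacency matrix. The sketch covers only this input construction, not the graph auto-encoder or the random forest, and it is not the authors' code: the function name, the node identifiers and the binary edge weighting are assumptions.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]
Node = Tuple[str, str]  # (node type, identifier)


def build_adjacency(lnc_mirna: List[Edge],
                    mirna_disease: List[Edge],
                    lnc_disease: List[Edge]) -> Tuple[Dict[Node, int], List[List[int]]]:
    """Merge three association lists into one undirected 0/1 adjacency matrix."""
    typed_edges = (
        [(("lncRNA", a), ("miRNA", b)) for a, b in lnc_mirna]
        + [(("miRNA", a), ("disease", b)) for a, b in mirna_disease]
        + [(("lncRNA", a), ("disease", b)) for a, b in lnc_disease]
    )
    # Tagging each node with its type keeps identically named entities apart.
    nodes = sorted({node for edge in typed_edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in typed_edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return index, matrix


# Placeholder identifiers, not real associations.
index, adjacency = build_adjacency(
    lnc_mirna=[("lnc1", "mir1"), ("lnc2", "mir1")],
    mirna_disease=[("mir1", "disA")],
    lnc_disease=[("lnc1", "disA")],
)
```

A matrix of this kind is the usual input to a graph auto-encoder, whose learned node embeddings would then be passed to the classifier.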


Subject(s)
Lung Neoplasms/genetics , Machine Learning , Neural Networks, Computer , Prostatic Neoplasms/genetics , RNA, Long Noncoding/genetics , Stomach Neoplasms/genetics , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Computational Biology/methods , Computer Graphics/statistics & numerical data , Decision Trees , Gene Expression Regulation, Neoplastic , Humans , Lung Neoplasms/diagnosis , Lung Neoplasms/metabolism , Lung Neoplasms/pathology , Male , MicroRNAs/classification , MicroRNAs/genetics , MicroRNAs/metabolism , Prostatic Neoplasms/diagnosis , Prostatic Neoplasms/metabolism , Prostatic Neoplasms/pathology , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , ROC Curve , Risk Factors , Stomach Neoplasms/diagnosis , Stomach Neoplasms/metabolism , Stomach Neoplasms/pathology
19.
Methods Mol Biol ; 2254: 41-60, 2021.
Article in English | MEDLINE | ID: mdl-33326069

ABSTRACT

K-mer-based comparisons have emerged as powerful complements to BLAST-like alignment algorithms, particularly when the sequences being compared lack direct evolutionary relationships. In this chapter, we describe methods to compare k-mer content between groups of long noncoding RNAs (lncRNAs), to identify communities of lncRNAs with related k-mer contents, to identify the enrichment of protein-binding motifs in lncRNAs, and to scan for domains of related k-mer contents in lncRNAs. Our step-by-step instructions are complemented by Python code deposited on GitHub. Though our chapter focuses on lncRNAs, the methods we describe could be applied to any set of nucleic acid sequences.
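
An alignment-free comparison of k-mer content between two sequences can be sketched as follows. This is not the chapter's deposited code: the function names are hypothetical, and using Pearson correlation of normalized k-mer frequencies as the similarity measure is an assumption made for illustration.

```python
import math
from itertools import product
from typing import List


def kmer_profile(seq: str, k: int = 3) -> List[float]:
    """Frequency of every possible k-mer over ACGT, in a fixed lexicographic order."""
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # windows containing N or other symbols are skipped
            counts[kmer] += 1
    windows = max(len(seq) - k + 1, 1)
    return [counts[m] / windows for m in kmers]


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Two made-up sequences; a higher correlation means more similar k-mer content.
similarity = pearson(kmer_profile("ACGUACGUACGUAAAC"), kmer_profile("ACGTACGTTTGACGTA"))
```

Because the profiles have a fixed length of 4^k regardless of sequence length, sequences of very different lengths can be compared directly, and a matrix of pairwise similarities could feed a clustering or community-detection step.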


Subject(s)
Computational Biology/methods , RNA, Long Noncoding/classification , RNA, Long Noncoding/genetics , Algorithms , Cluster Analysis , Nucleotide Motifs/genetics , Protein Binding
20.
Nucleic Acids Res ; 49(D1): D1244-D1250, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33219661

ABSTRACT

We describe an updated comprehensive database, LincSNP 3.0 (http://bioinfo.hrbmu.edu.cn/LincSNP), which aims to document and annotate disease or phenotype-associated variants in human long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs) or their regulatory elements. LincSNP 3.0 has been updated with several novel features: (i) the types of variants have been expanded to include single nucleotide polymorphisms (SNPs), linkage disequilibrium SNPs (LD SNPs), somatic mutations and RNA editing sites; (ii) more regulatory elements have been added, including transcription factor binding sites (TFBSs), enhancers, DNase I hypersensitive sites (DHSs), topologically associated domains (TADs), footprints, methylations and open chromatin regions; (iii) the associations among circRNAs, regulatory elements and variants have been identified; (iv) more experimentally supported variant-lncRNA/circRNA-disease/phenotype associations have been manually collected; (v) the sources of lncRNAs, circRNAs, SNPs, somatic mutations and RNA editing sites have been updated. Moreover, four flexible online tools, namely Genome Browser, Variant Mapper, Circos Plotter and Functional Annotation, have been developed to retrieve, visualize and analyze the data. Collectively, LincSNP 3.0 provides associations among functional variants, regulatory elements, lncRNAs and circRNAs in diseases. It will serve as an important and continually updated resource for investigating the functions and mechanisms of lncRNAs and circRNAs in diseases.


Subject(s)
Databases, Nucleic Acid , Disease/genetics , Genome, Human , RNA, Circular/genetics , RNA, Long Noncoding/genetics , Regulatory Sequences, Nucleic Acid , Binding Sites , Chromatin/chemistry , Chromatin/metabolism , Deoxyribonuclease I/genetics , Deoxyribonuclease I/metabolism , Disease/classification , Humans , Internet , Linkage Disequilibrium , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Protein Binding , RNA, Circular/classification , RNA, Circular/metabolism , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , Software , Transcription Factors/genetics , Transcription Factors/metabolism