ncRNA Coding Potential Prediction Using BiLSTM and Transformer Encoder-Based Model.
Zhang, Jingpu; Lu, Hao; Jiang, Ying; Ma, Yuanyuan; Deng, Lei.
Affiliation
  • Zhang J; School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan 467000, China.
  • Lu H; School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan 467000, China.
  • Jiang Y; School of Computer Science and Engineering, Central South University, Changsha 410018, China.
  • Ma Y; School of Computer Engineering, Hubei University of Arts and Science, Xiangyang 441053, China.
  • Deng L; School of Computer Science and Engineering, Central South University, Changsha 410018, China.
J Chem Inf Model ; 64(16): 6712-6722, 2024 Aug 26.
Article in English | MEDLINE | ID: mdl-39120528
ABSTRACT
Many noncoding RNAs (ncRNAs) have been identified, and many of them play vital roles in various biological processes, including gene expression regulation, epigenetic regulation, transcription, and control. Recently, a few observations have revealed that some ncRNAs are translated into functional peptides. Many computational methods have been developed to predict the coding potential of these transcripts, which contributes to a deeper investigation of their functions. However, most of these methods are designed to distinguish ncRNAs from mRNAs rather than to assess the coding potential of ncRNAs themselves. It is therefore important to develop a highly accurate computational tool for identifying the coding potential of ncRNAs, thereby contributing to the discovery of novel peptides. In this Article, we propose a novel BiLSTM And Transformer encoder-based model (nBAT) with intrinsic features encoded for ncRNA coding potential prediction. In nBAT, we introduce a learnable position encoding mechanism to better obtain the embeddings of the ncRNA sequence. Moreover, we extract 43 intrinsic features from different perspectives and encode these features into the Transformer encoder by calculating their distances. Our performance comparisons show that nBAT outperforms state-of-the-art methods for coding potential prediction on different datasets. We also apply the method to new ncRNAs to identify their coding potential, and the results further indicate the competitive performance of nBAT. We expect that the method can serve as a useful tool for high-throughput coding potential prediction for ncRNAs.
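The abstract does not say how the learnable position encoding in nBAT is built. As a rough illustration of the general idea only, the sketch below sums a per-nucleotide embedding with a per-position embedding taken from a lookup table; in a real model both tables would be trainable parameters. The vocabulary `ACGU`, the dimensions, and the names `init_table` and `embed_sequence` are assumptions made for this example and are not taken from the paper.

```python
import random

# Assumed vocabulary; the abstract does not specify how sequences are tokenized.
NUCLEOTIDES = "ACGU"


def init_table(rows, dim, rng):
    """Build a rows x dim table of small random values.

    This stands in for a weight matrix that training would update.
    """
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def embed_sequence(seq, token_table, position_table):
    """Return one vector per base: token embedding + position embedding."""
    if len(seq) > len(position_table):
        raise ValueError("sequence is longer than the position table")
    embedded = []
    for pos, base in enumerate(seq):
        token_vec = token_table[NUCLEOTIDES.index(base)]  # lookup by nucleotide
        position_vec = position_table[pos]                # lookup by index
        embedded.append([t + p for t, p in zip(token_vec, position_vec)])
    return embedded


rng = random.Random(0)
DIM, MAX_LEN = 8, 16  # hypothetical sizes chosen for the example
token_table = init_table(len(NUCLEOTIDES), DIM, rng)
position_table = init_table(MAX_LEN, DIM, rng)

embedded = embed_sequence("AUGGCA", token_table, position_table)
print(len(embedded), len(embedded[0]))
```

Because the position vector differs at every index, the two `A` bases in `AUGGCA` receive different embeddings, which is what lets a downstream encoder tell positions apart.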

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Computational Biology / RNA, Untranslated Limits: Humans Language: En Journal: J Chem Inf Model Journal subject: Medical Informatics / Chemistry Year: 2024 Document type: Article Country of affiliation: China Country of publication: United States
