LPI-SKMSC: Predicting LncRNA-Protein Interactions with Segmented k-mer Frequencies and Multi-space Clustering.
Sun, Dian-Zheng; Sun, Zhan-Li; Liu, Mengya; Yong, Shuang-Hao.
Affiliation
  • Sun DZ; School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, China.
  • Sun ZL; School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, China. zhlsun2006@126.com.
  • Liu M; School of Computer Science and Technology, Anhui University, Hefei, 230601, China.
  • Yong SH; School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, China.
Interdiscip Sci ; 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38206558
ABSTRACT
 Long noncoding RNAs (lncRNAs) play significant regulatory roles in gene expression, and interacting with proteins is one of the ways they do so. Because experiments to determine lncRNA-protein interactions (LPIs) are expensive and time-consuming, many computational methods for predicting LPIs have been proposed as alternatives. In LPI prediction, positive and negative samples are commonly imbalanced, yet few existing methods give this problem specific consideration. In this paper, we propose LPI-SKMSC, a new clustering-based LPI prediction method that uses segmented k-mer frequencies and multi-space clustering and is dedicated to handling this imbalance. We construct segmented k-mer frequencies to obtain global and local features of lncRNA and protein sequences. Multi-space clustering is then applied: convolutional neural network (CNN)-based encoders map the different features of a sample to different spaces, and the multiple spaces jointly constrain the classification of samples. Finally, the distance between the encoder's output features and the cluster center is calculated in each space, and the sum of the distances over all spaces is compared with the cluster radius to predict LPIs. In cross-validation on 3 public datasets, LPI-SKMSC showed the best performance among the existing methods compared. The experimental results showed that LPI-SKMSC predicts LPIs more effectively when positive and negative samples are imbalanced. In addition, we illustrate that our model is better at uncovering potential lncRNA-protein interaction pairs.
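The feature construction and decision rule described in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the authors' implementation. The function names, the RNA alphabet, the equal-length segmentation, the Euclidean distance, and the convention that a sample lies inside the cluster when the summed distance does not exceed the radius are all assumptions made here. The CNN-based encoders are not reproduced, so the per-space embeddings are taken as given inputs.

```python
from itertools import product
import math


def kmer_frequencies(seq, k, alphabet="ACGU"):
    """Normalized frequency of every possible k-mer over `alphabet` in `seq`."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing characters outside the alphabet
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def segmented_kmer_frequencies(seq, k, n_segments, alphabet="ACGU"):
    """Global k-mer vector followed by one local vector per segment.

    Assumption: segments are contiguous and of (nearly) equal length.
    """
    features = kmer_frequencies(seq, k, alphabet)  # global view of the sequence
    bounds = [round(i * len(seq) / n_segments) for i in range(n_segments + 1)]
    for start, end in zip(bounds, bounds[1:]):
        features += kmer_frequencies(seq[start:end], k, alphabet)  # local view
    return features


def within_cluster(embeddings, centers, radius):
    """Sum the per-space distances to the cluster centers and compare with the radius.

    Assumption: Euclidean distance, and "inside" means summed distance <= radius.
    """
    total = sum(math.dist(e, c) for e, c in zip(embeddings, centers))
    return total <= radius
```

For a protein sequence, the same functions could be called with a different `alphabet` (for example the 20 amino-acid letters); whether the paper encodes proteins this way is not stated in the abstract.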
Full text: 1 Collections: 01-international Database: MEDLINE Study type: Prognostic_studies / Risk_factors_studies Language: En Journal: Interdiscip Sci Journal subject: BIOLOGY Publication year: 2024 Document type: Article Affiliation country: China
