LncSTPred: a predictive model of lncRNA subcellular localization and decipherment of the biological determinants influencing localization.
Front Mol Biosci; 11: 1452142, 2024.
Article in English | MEDLINE | ID: mdl-39301172
ABSTRACT
Introduction:
Long non-coding RNAs (lncRNAs) play crucial roles in genetic markers, genome rearrangement, chromatin modifications, and other biological processes. Increasing evidence suggests that lncRNA functions are closely related to their subcellular localization. However, lncRNAs are unevenly distributed across subcellular compartments: the number of lncRNAs located in the nucleus is more than ten times that in the exosome.
Methods:
In this study, we propose a new oversampling method to construct a predictive dataset and develop a predictive model called LncSTPred. The model uses an improved AdaBoost algorithm to predict subcellular localization from 3-mer, 3-RF sequence, and minimum free energy structure features.
Results and Discussion:
Our improved AdaBoost algorithm achieved better prediction accuracy for lncRNA subcellular localization. In addition, we evaluated feature importance using the F-score and analyzed the influence of highly relevant features on lncRNAs. Our study shows that the ANA features may be a key factor in predicting lncRNA subcellular localization, and that they correlate with the composition of stems and loops in the secondary structure of lncRNAs.
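As a rough illustration of two ingredients named in the abstract, the sketch below computes 3-mer frequencies for an RNA sequence and a two-class F-score for a single feature. It is not the authors' implementation. The function names `kmer_frequencies` and `f_score` are hypothetical, the normalization of counts to frequencies is an assumption, and the F-score uses the common two-class definition (between-class separation of the means divided by the sum of within-class sample variances), which may differ from the variant used in the paper.

```python
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGU"):
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Assumption: DNA-style input is accepted by mapping T to U, and windows
    containing symbols outside the alphabet (for example N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def f_score(pos, neg):
    """Two-class F-score of one feature.

    pos and neg hold the feature's values in the two classes; each class
    needs at least two samples so the sample variance is defined.
    """
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    var_pos = sum((x - mean_pos) ** 2 for x in pos) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg) / (n_neg - 1)
    numerator = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    denominator = var_pos + var_neg
    return numerator / denominator if denominator else float("inf")


if __name__ == "__main__":
    freqs = kmer_frequencies("ACGUACGU")
    print(len(freqs))  # 64 features for k = 3 over a 4-letter alphabet
    print(f_score([1, 2, 3], [4, 5, 6]))
```

A higher F-score indicates that a feature separates the two classes more clearly. For a multi-class localization task, one plausible option is a one-vs-rest score per compartment, though the abstract does not specify how the score was applied.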
Database: MEDLINE
Language: En
Journal: Front Mol Biosci
Year: 2024
Document type: Article
Country of affiliation: China