UsIL-6: An unbalanced learning strategy for identifying IL-6 inducing peptides by undersampling technique.
Liao, Yan-Hong; Chen, Shou-Zhi; Bin, Yan-Nan; Zhao, Jian-Ping; Feng, Xin-Long; Zheng, Chun-Hou.
Affiliation
  • Liao YH; School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China.
  • Chen SZ; School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China.
  • Bin YN; School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China.
  • Zhao JP; School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China. Electronic address: jpzhao@xju.edu.cn.
  • Feng XL; School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China. Electronic address: fxlmath@xju.edu.cn.
  • Zheng CH; School of Mathematics and System Science, Xinjiang University, Urumqi, Xinjiang 830017, China; School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China.
Comput Methods Programs Biomed ; 250: 108176, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38677081
ABSTRACT
BACKGROUND AND OBJECTIVE:

Interleukin-6 (IL-6) is a critical factor for early warning, monitoring, and prognosis of the inflammatory storm in COVID-19 cases. IL-6 inducing peptides, which induce production of the cytokine IL-6, are important for developing diagnostics and immunotherapies. Although existing methods have had some success in predicting IL-6 inducing peptides, the performance of these models in practical applications still leaves room for improvement.

METHODS:

In this study, we propose UsIL-6, a high-performance bioinformatics tool for identifying IL-6 inducing peptides. First, we extracted five groups of features describing physicochemical properties and sequence structural information from IL-6 inducing peptide sequences, obtaining a 636-dimensional feature vector. We also applied the NearMiss3 undersampling method and the StandardScaler normalization method to process the data. Then, a 40-dimensional optimal feature vector was obtained with the Boruta feature selection method. Finally, we combined this feature vector with an extremely randomized trees classifier to build the final model, UsIL-6.
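The two preprocessing steps named above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names `standardize` and `near_miss3`, the parameters `n_keep`, `m_neighbors`, and `k_neighbors`, and the toy data are all assumptions made for illustration. The sketch follows the commonly described NearMiss-3 rule: first shortlist, for each minority sample, its nearest majority samples, then keep the shortlisted majority samples whose average distance to their nearest minority samples is largest.

```python
import math
import statistics


def standardize(rows):
    """Z-score each column (zero mean, unit variance), in the spirit of StandardScaler."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Population standard deviation; constant columns fall back to 1.0
    # so that the division below is always defined.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def near_miss3(majority, minority, n_keep, m_neighbors=3, k_neighbors=3):
    """Return sorted indices of n_keep majority samples chosen by the NearMiss-3 rule."""
    # Step 1: shortlist majority samples that are among the m nearest
    # majority neighbours of at least one minority sample.
    shortlist = set()
    for p in minority:
        ranked = sorted(range(len(majority)), key=lambda i: math.dist(p, majority[i]))
        shortlist.update(ranked[:m_neighbors])

    # Step 2: within the shortlist, prefer samples whose average distance
    # to their k nearest minority samples is largest.
    def avg_dist(i):
        nearest = sorted(math.dist(majority[i], p) for p in minority)[:k_neighbors]
        return sum(nearest) / len(nearest)

    return sorted(sorted(shortlist, key=avg_dist, reverse=True)[:n_keep])


# Hypothetical toy data: 2 minority (positive) and 5 majority (negative) points.
minority = [[0.0, 0.0], [1.0, 0.0]]
majority = [[0.0, 1.0], [1.0, 1.0], [5.0, 5.0], [0.5, 0.2], [10.0, 10.0]]
kept = near_miss3(majority, minority, n_keep=2, m_neighbors=2, k_neighbors=2)
scaled = standardize([[1, 10], [3, 10]])
```

In this toy case the distant majority points (indices 2 and 4) never reach the shortlist, and of the three shortlisted points the one sitting closest to the minority samples (index 3) is dropped in the second step. In practice a library implementation, such as the NearMiss sampler in imbalanced-learn, would normally be used instead of hand-written code.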

RESULTS:

On the independent test dataset, UsIL-6 achieved an AUC of 0.87 and a balanced accuracy (BACC) of 0.808, indicating that it outperformed existing methods in identifying IL-6 inducing peptides.
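For readers less familiar with these two metrics, the following is a small standard-library Python sketch of how they are usually defined. The function names and all counts and scores are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the study. BACC is the mean of sensitivity and specificity, which makes it less sensitive to class imbalance than plain accuracy, and AUC can be read as the probability that a randomly chosen positive receives a higher score than a randomly chosen negative.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def auc(pos_scores, neg_scores):
    """Fraction of positive/negative pairs ranked correctly (ties count as 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical imbalanced test set: 100 positives, 1000 negatives.
bacc_model = balanced_accuracy(tp=80, fn=20, tn=900, fp=100)    # about 0.85
# A classifier that labels everything negative scores about 0.909 on
# plain accuracy (1000 / 1100), but only 0.5 on balanced accuracy.
bacc_trivial = balanced_accuracy(tp=0, fn=100, tn=1000, fp=0)

# Hypothetical predicted scores: 5 of the 6 positive/negative pairs are ordered correctly.
auc_toy = auc([0.9, 0.8, 0.4], [0.7, 0.3])
```

The contrast between the two balanced-accuracy values shows why a metric of this kind is a natural choice when the negative class is much larger than the positive class.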

CONCLUSIONS:

The performance comparison on the independent test dataset showed that UsIL-6 achieved the highest performance, the best robustness, and the best generalization ability among the compared methods. We hope that UsIL-6 will become a valuable method for identifying, annotating, and characterizing new IL-6 inducing peptides.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Peptides / Interleukin-6 / Computational Biology Limit: Humans Language: En Journal: Comput Methods Programs Biomed Journal subject: MEDICAL INFORMATICS Year: 2024 Document type: Article Affiliation country: China

...