HemoDL: Hemolytic peptides prediction by double ensemble engines from Rich sequence-derived and transformer-enhanced information.
Yang, Sen; Xu, Piao.
Affiliation
  • Yang S; School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China; The Affiliated Changzhou No.2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China.
  • Xu P; College of Economics and Management, Nanjing Forestry University, China. Electronic address: xupiao@njfu.edu.cn.
Anal Biochem ; 690: 115523, 2024 Jul.
Article in En | MEDLINE | ID: mdl-38552762
ABSTRACT
Hemolytic peptides can trigger hemolysis by rupturing red blood cell membranes and disrupting the cells. Because in-lab identification is labor-intensive and time-consuming, accurate, high-throughput prediction of hemolytic peptides is crucial for keeping pace with the growth of peptide sequence data in proteomics and peptidomics. In this study, we present HemoDL, an ensemble learning model that uses a double LightGBM framework to learn the distinct distribution of sequence characteristics and predict the hemolytic activity of peptides. To determine the most informative encoding features, we compare 17 widely used features across four benchmark datasets. Our investigation reveals that CTD, BPF, Charge, AAC, GDPC, ATC, QSO, and transformer-based features contribute more positively to detecting the hemolytic activity of peptides. Comparison with eight state-of-the-art methods demonstrates that HemoDL outperforms the other models, with Matthews Correlation Coefficient gains of 6.30%-16.04%, 6.63%-11.26%, 4.76%-9.92%, and 7.41%-15.03% on the four test datasets, respectively. Additionally, we provide HemoDL with a user-friendly graphical interface, available at https://github.com/abcair/HemoDL. In summary, the HemoDL model, leveraging CTD, BPF, Charge, AAC, GDPC, ATC, QSO, and transformer-based encoding features within a double LightGBM learning framework, achieves high accuracy in predicting the hemolytic activity of peptides.
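For readers unfamiliar with two of the terms in the abstract, the following is a minimal sketch, not the HemoDL implementation. It assumes that AAC denotes the standard amino acid composition encoding (the fraction of each of the 20 standard residues in a peptide) and computes the Matthews Correlation Coefficient from binary labels. The function names `aac` and `mcc` and the toy inputs are hypothetical and chosen only for illustration.

```python
import math

# The 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Amino acid composition: fraction of each standard residue in the peptide.

    Assumption: this is the usual definition of AAC; the paper's exact
    encoding may differ in detail.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(residue) / len(seq) for residue in AMINO_ACIDS]


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (1 = hemolytic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator is 0.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    features = aac("AAKK")  # toy peptide, not taken from any dataset
    print(len(features), features[AMINO_ACIDS.index("A")])
    print(mcc([1, 1, 0, 0], [1, 1, 0, 0]))
```

MCC ranges from -1 to 1; the abstract reports differences in MCC as percentages, which corresponds to multiplying by 100.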
Full text: 1 Database: MEDLINE Language: En Journal: Anal Biochem Publication year: 2024 Document type: Article Affiliation country: China
