A novel fusion technology utilizing complex network and sequence information for FAD-binding site identification.
Zhang, Lichao; Xiao, Kang; Wang, Xueting; Kong, Liang.
Affiliation
  • Zhang L; School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao, PR China; Hebei Innovation Center for Smart Perception and Applied Technology of Agricultural Data, Qinhuangdao, PR China.
  • Xiao K; School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao, PR China.
  • Wang X; School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao, PR China.
  • Kong L; Hebei Innovation Center for Smart Perception and Applied Technology of Agricultural Data, Qinhuangdao, PR China; School of Mathematics and Information Science & Technology, Hebei Normal University of Science & Technology, Qinhuangdao, PR China. Electronic address: 2780@hevttc.edu.cn.
Anal Biochem; 685: 115401, 2024-01-15.
Article in English | MEDLINE | ID: mdl-37981176
ABSTRACT
Flavin adenine dinucleotide (FAD) binding sites play an increasingly important role as targets for inhibiting bacterial infections. To exploit protein topological structural information as a complement for the identification of FAD-binding sites, we designed a novel fusion technology based on sequence and complex network information. The specially designed feature vectors were combined and fed into CatBoost for model construction. Moreover, because the minority class (positive samples) is more significant for biological research, a random under-sampling technique was applied to address the class imbalance. Compared with previous methods, our methods achieved the best results on two independent test datasets. In particular, the MCC values obtained by FADsite and FADsite_seq were 14.37 % to 53.37 % and 21.81 % to 60.81 % higher than those of existing methods on Test6, and they showed improvements of 6.03 % to 21.96 % and 19.77 % to 35.70 % on Test4. In addition, statistical tests show that our methods differ significantly from the state-of-the-art methods, and the cross-entropy loss shows that their predictions have high certainty. These results demonstrate the effectiveness of using sequence and complex network information in identifying FAD-binding sites. The approach may be complementary to other biological studies. The data and source code are available at https://github.com/Kangxiaoneuq/FADsite.
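The abstract names two generic building blocks, random under-sampling of the majority class and the Matthews correlation coefficient (MCC), without giving implementation details. The following is a minimal sketch of both using only the Python standard library; it is not the authors' code. The function names, the 1:1 positive-to-negative ratio, and the toy data are assumptions made for illustration, and the CatBoost model itself is omitted.

```python
import math
import random


def random_undersample(features, labels, seed=0):
    """Keep every positive sample and an equally sized random subset of negatives.

    Assumption: a 1:1 class ratio after sampling; the abstract does not state the ratio.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    kept_neg = rng.sample(neg, min(len(pos), len(neg)))
    keep = sorted(pos + kept_neg)  # preserve the original sample order
    return [features[i] for i in keep], [labels[i] for i in keep]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (hypothetical): 3 binding residues among 20 residues.
toy_labels = [1] * 3 + [0] * 17
toy_features = [[float(i)] for i in range(20)]
balanced_X, balanced_y = random_undersample(toy_features, toy_labels, seed=42)
```

In such a setup, under-sampling would typically be applied to the training data only, so that MCC on the independent test sets reflects the original class imbalance.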

Full text: 1 Database: MEDLINE Main subject: Proteins / Flavin-Adenine Dinucleotide Language: En Year of publication: 2024 Document type: Article
