A novel data augmentation approach for influenza A subtype prediction based on HA proteins.
Sohrabi, Mohammad Amin; Zare-Mirakabad, Fatemeh; Ghidary, Saeed Shiri; Saadat, Mahsa; Sadegh-Zadeh, Seyed-Ali.
Affiliation
  • Sohrabi MA; Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran.
  • Zare-Mirakabad F; Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran.
  • Ghidary SS; Department of Computing, School of Digital, Technologies, and Arts, Staffordshire University, Stoke-On-Trent, UK.
  • Saadat M; Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran.
  • Sadegh-Zadeh SA; Department of Computing, School of Digital, Technologies, and Arts, Staffordshire University, Stoke-On-Trent, UK. Electronic address: ali.sadegh-zadeh@staffs.ac.uk.
Comput Biol Med; 172: 108316, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38503091
ABSTRACT
Influenza, a pervasive viral respiratory illness, remains a significant global health concern. Because the influenza A virus can cause pandemics, timely identification of specific subtypes is necessary for effective prevention and control, as highlighted by the World Health Organization. The genetic diversity of influenza A virus, especially in the hemagglutinin protein, presents challenges for accurate subtype prediction. This study introduces PreIS, a novel pipeline that uses advanced protein language models and supervised data augmentation to discern subtle differences in hemagglutinin protein sequences. PreIS makes two key contributions: leveraging pre-trained protein language models for influenza subtype classification, and using supervised data augmentation to generate additional training data without extensive annotations. Extensive experiments show that the pipeline reaches an accuracy of 94.54%, compared with 89.6% for the current state-of-the-art model, MC-NN. PreIS also exhibits proficiency in handling unknown subtypes, which matters for early detection. By pioneering the classification of HxNy subtypes based solely on the hemagglutinin protein chain, this research sets a benchmark for future studies. These findings promise more precise and timely influenza subtype prediction, enhancing public health preparedness against influenza outbreaks and pandemics. The data and code underlying this article are available at https://github.com/CBRC-lab/PreIS.

Full text: 1 | Database: MEDLINE | Main subject: Influenza A virus / Influenza, Human | Limit: Humans | Language: English | Journal: Comput Biol Med | Year: 2024 | Document type: Article | Affiliation country: Iran
