PREHOST: Host prediction of coronaviridae family using machine learning.
Heliyon
; 9(2): e13646, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36816252
ABSTRACT
Coronaviruses are zoonotic viruses capable of transmitting infection from animals to humans, and one recently caused a pandemic. In such circumstances, it is essential to understand the virus's origin. In this study, we present PreHost, a novel machine-learning pipeline for host prediction in the family Coronaviridae. We leverage the complete viral genome as well as protein-level sequences (spike, membrane, and nucleocapsid proteins). Compared with current state-of-the-art approaches, the random forest model attained high accuracy and recall scores of 99.91% and 0.98, respectively, for genome sequences. Our study shows that, in addition to spike protein sequences, membrane and nucleocapsid protein sequences can be used to predict the host of a virus. We also identified important sites in the viral sequences that help distinguish between host classes. PreHost can serve as a valuable tool for taking effective measures to control the transmission of future viruses.
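The abstract reports accuracy as a percentage and recall as a fraction. As a minimal sketch of how these two metrics are commonly computed for a multi-class host-prediction task, the snippet below uses only the Python standard library. The host labels and predictions are hypothetical, made up for illustration, and macro-averaging of recall is an assumption: the abstract does not state which averaging scheme was used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted host equals the true host."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per host class: correctly predicted samples / true samples of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / n for label, n in support.items()}


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed averaging scheme)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical host labels and predictions, not data from the study.
y_true = ["human", "human", "bat", "bat", "bat", "camel"]
y_pred = ["human", "bat", "bat", "bat", "bat", "camel"]

acc = accuracy(y_true, y_pred)
rec = macro_recall(y_true, y_pred)

# Report both on the scales used in the abstract: percent and fraction.
print(f"accuracy = {acc:.2%}, macro recall = {rec:.2f}")
```

Macro-averaging weights every host class equally, so a rare host that is often misclassified lowers the score even when overall accuracy stays high.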
Full text: 1
Database: MEDLINE
Study type: Prognostic_studies / Risk_factors_studies
Language: En
Publication year: 2023
Document type: Article