PLM-ARG: antibiotic resistance gene identification using a pretrained protein language model.
Wu, Jun; Ouyang, Jian; Qin, Haipeng; Zhou, Jiajia; Roberts, Ruth; Siam, Rania; Wang, Lan; Tong, Weida; Liu, Zhichao; Shi, Tieliu.
Affiliation
  • Wu J; Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China.
  • Ouyang J; Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China.
  • Qin H; Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China.
  • Zhou J; Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China.
  • Roberts R; ApconiX Ltd, Alderley Park, Alderley Edge SK10 4TG, United Kingdom.
  • Siam R; University of Birmingham, Birmingham B15 2TT, United Kingdom.
  • Wang L; Biology Department, School of Sciences and Engineering, The American University in Cairo, New Cairo 11835, Egypt.
  • Tong W; College of Architecture and Urban Planning, Tongji University, Shanghai 200092, China.
  • Liu Z; National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR 72079, United States.
  • Shi T; Nonclinical Drug Safety, Boehringer Ingelheim Pharmaceuticals, Inc, Ridgefield, CT 06877, United States.
Bioinformatics; 39(11), 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37995287
ABSTRACT
MOTIVATION:

Antibiotic resistance presents a formidable global challenge to public health and the environment. Considerable effort has been devoted to identifying antibiotic resistance genes (ARGs) in order to assess the threat of antibiotic resistance. However, recent large-scale metagenomic and metatranscriptomic studies have revealed that a significant fraction of proteins cannot be annotated by conventional sequence similarity-based methods. This limitation extends to ARGs, which may be under-recognized when they diverge at the sequence level from known resistance genes.

RESULTS:

We propose PLM-ARG, an artificial intelligence-powered ARG identification framework built on a pretrained large protein language model, which identifies ARGs and classifies their resistance categories simultaneously. PLM-ARG was developed using the most comprehensive ARG and resistance category information (>28K ARGs and 29 associated resistance categories) and yielded a Matthews correlation coefficient (MCC) of 0.983 ± 0.001 under 5-fold cross-validation. On an independent validation set, the model achieved an MCC of 0.838, outperforming other publicly available ARG prediction tools with improvements ranging from 51.8% to 107.9%. The utility of PLM-ARG was further demonstrated by annotating resistance in the UniProt database and evaluating the impact of ARGs on the Earth's environmental microbiota.

AVAILABILITY AND IMPLEMENTATION:

PLM-ARG is available for academic purposes at https://github.com/Junwu302/PLM-ARG, and a user-friendly webserver (http://www.unimd.org/PLM-ARG) is also provided.
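For readers unfamiliar with the metric reported in the results, the Matthews correlation coefficient can be computed from a binary confusion matrix. The following is a minimal sketch for the two-class case (ARG versus non-ARG) and is not taken from the PLM-ARG code. The function name and the example counts are hypothetical and chosen only for illustration. The abstract does not state how the MCC was computed for the 29 resistance categories.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    tp, tn, fp, fn are the counts of true positives, true negatives,
    false positives, and false negatives.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts (not from the paper), showing the range of the metric.
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)     # every call correct
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)    # no better than random
inverted = matthews_corrcoef(tp=0, tn=0, fp=50, fn=50)    # every call wrong
```

The MCC ranges from -1 (total disagreement) through 0 (chance level) to +1 (perfect prediction), and it uses all four cells of the confusion matrix, which makes it informative when positive and negative classes are imbalanced.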
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Artificial Intelligence / Anti-Bacterial Agents Language: En Publication year: 2023 Document type: Article