Machine Learning Models to Interrogate Proteome-Wide Covalent Ligandabilities Directed at Cysteines.
Liu, Ruibin; Clayton, Joseph; Shen, Mingzhe; Bhatnagar, Shubham; Shen, Jana.
Affiliation
  • Liu R; Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States.
  • Clayton J; Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States.
  • Shen M; Division of Applied Regulatory Science, Office of Clinical Pharmacology, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland 20993, United States.
  • Bhatnagar S; Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States.
  • Shen J; Department of Computer Science, University of Maryland at College Park, College Park, Maryland 20742, United States.
JACS Au; 4(4): 1374-1384, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38665640
ABSTRACT
Machine learning (ML) identification of covalently ligandable sites may accelerate targeted covalent inhibitor design and help expand the druggable proteome space. Here, we report the rigorous development and validation of tree-based models and convolutional neural networks (CNNs) trained on a newly curated database (LigCys3D) of over 1000 liganded cysteines in nearly 800 proteins, represented by over 10,000 three-dimensional structures in the Protein Data Bank (PDB). On unseen test data, the tree models and CNNs achieved areas under the receiver operating characteristic curve of 94% and 93%, respectively. Based on AlphaFold2-predicted structures, the ML models recapitulated the newly liganded cysteines in the PDB with recall values over 90%. To assist the covalent drug discovery community, we report the predicted ligandable cysteines in 392 human kinases and their locations in the sequence-aligned kinase structure, including the PH and SH2 domains. Furthermore, we disseminate a searchable online database, LigCys3D (https://ligcys.computchem.org/), and a web prediction server, DeepCys (https://deepcys.computchem.org/), both of which will be continuously updated and improved by including newly published experimental data. The present work represents an important step toward the ML-led integration of big genome data and structure models to annotate the human proteome space for next-generation covalent drug discovery.
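
The abstract reports model quality as area under the receiver operating characteristic curve (AUC) and as recall. As a minimal sketch of what these two metrics measure, the Python below computes both from per-cysteine labels and predicted scores. It is not the authors' evaluation code: the function names `roc_auc` and `recall`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive outscores
    a randomly chosen negative (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall(labels: Sequence[int], scores: Sequence[float],
           threshold: float = 0.5) -> float:
    """Fraction of true positives recovered at a given score threshold.
    The default threshold of 0.5 is an illustrative assumption."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    if tp + fn == 0:
        raise ValueError("no positive examples")
    return tp / (tp + fn)


# Hypothetical toy data: 1 = liganded cysteine, 0 = not liganded.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

print(f"AUC    = {roc_auc(labels, scores):.4f}")
print(f"recall = {recall(labels, scores):.2f}")
```

AUC is independent of any threshold because it depends only on how positives rank against negatives, whereas recall depends on the chosen cutoff; this is why the two numbers answer different questions about a ligandability classifier.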

Full text: 1 Database: MEDLINE Language: En Journal: JACS Au Year: 2024 Document type: Article Country of affiliation: United States