Does protein pretrained language model facilitate the prediction of protein-ligand interaction?
Methods; 219: 8-15, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37690736
ABSTRACT
Protein-ligand interaction (PLI) is a critical step in drug discovery. Recently, protein pretrained language models (PLMs) have shown exceptional performance across a wide range of protein-related tasks. However, significant heterogeneity exists between PLM and PLI tasks, leading to a degree of uncertainty. In this study, we propose a method that quantitatively assesses the significance of protein PLMs in PLI prediction. Specifically, we analyze the performance of three widely used protein PLMs (TAPE, ESM-1b, and ProtTrans) on three PLI tasks (PDBbind, Kinase, and DUD-E). The models with pre-training consistently achieve improved performance and decreased time cost, demonstrating that pre-training enhances both the accuracy and efficiency of PLI prediction. By quantitatively assessing transferability, the optimal PLM for each PLI task is identified without the need for costly transfer experiments. Additionally, we examine the contributions of PLMs to the distribution of the feature space, highlighting the improved discriminability after pre-training. Our findings provide insights into the mechanisms underlying PLMs in PLI prediction and pave the way for the design of more interpretable and accurate PLMs in the future. Code and data are freely available at https://github.com/brian-zZZ/PLM-PLI.
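The abstract notes that pre-training improves the discriminability of the feature space. A minimal sketch of one common way to quantify this, a Fisher-style between-class vs. within-class scatter ratio computed on embedding vectors, is shown below. This is an illustrative metric, not the paper's own code; `fisher_separability` and the toy data are hypothetical.

```python
import numpy as np

def fisher_separability(features: np.ndarray, labels: np.ndarray) -> float:
    """Ratio of between-class to within-class scatter (higher = more discriminable).

    features: (n_samples, dim) embeddings, e.g. mean-pooled PLM outputs.
    labels:   (n_samples,) class labels, e.g. binary interaction labels.
    """
    overall_mean = features.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(labels):
        cls = features[labels == c]
        cls_mean = cls.mean(axis=0)
        between += len(cls) * np.sum((cls_mean - overall_mean) ** 2)
        within += np.sum((cls - cls_mean) ** 2)
    return between / within

# Toy check: well-separated clusters score higher than overlapping ones.
rng = np.random.default_rng(0)
separated = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(5, 1, (50, 8))])
overlapping = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(0.5, 1, (50, 8))])
y = np.array([0] * 50 + [1] * 50)
print(fisher_separability(separated, y) > fisher_separability(overlapping, y))
```

Comparing this score for embeddings taken before and after pre-training is one way to make the "improved discriminability" claim concrete.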
Database: MEDLINE
Main subject: Proteins / Language
Study type: Prognostic_studies / Risk_factors_studies
Language: En
Journal: Methods
Journal subject: BIOCHEMISTRY
Year: 2023
Document type: Article
Country of affiliation: China