1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581420

ABSTRACT

Protein-ligand interaction prediction presents a significant challenge in drug design. Numerous machine learning and deep learning (DL) models have been developed to accurately identify docking poses of ligands and active compounds against specific targets. However, current models often suffer from inadequate accuracy or lack practical physical significance in their scoring systems. In this research paper, we introduce IGModel, a novel approach that utilizes the geometric information of protein-ligand complexes as input for predicting the root mean square deviation of docking poses and the binding strength (pKd, the negative value of the logarithm of binding affinity) within the same prediction framework. This ensures that the output scores carry intuitive meaning. We extensively evaluate the performance of IGModel on various docking power test sets, including the CASF-2016 benchmark, PDBbind-CrossDocked-Core and DISCO set, consistently achieving state-of-the-art accuracies. Furthermore, we assess IGModel's generalizability and robustness by evaluating it on unbiased test sets and sets containing target structures generated by AlphaFold2. The exceptional performance of IGModel on these sets demonstrates its efficacy. Additionally, we visualize the latent space of protein-ligand interactions encoded by IGModel and conduct interpretability analysis, providing valuable insights. This study presents a novel framework for DL-based prediction of protein-ligand interactions, contributing to the advancement of this field. The IGModel is available at GitHub repository https://github.com/zchwang/IGModel.


Subject(s)
Deep Learning , Proteins , Proteins/chemistry , Protein Binding , Ligands , Drug Design
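
The two quantities that record 1 names as model outputs, the RMSD of a docking pose and pKd, have simple definitions. The sketch below only illustrates those definitions and is not IGModel code. The function names and toy coordinates are assumptions made for illustration, and the atoms of both poses are assumed to be listed in the same order.

```python
import math

def rmsd(pose, reference):
    """Root mean square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, r in zip(pose, reference)
        for a, b in zip(p, r)
    )
    return math.sqrt(squared / len(pose))

def pkd(kd_molar):
    """pKd = -log10(Kd), with the dissociation constant Kd given in mol/L."""
    return -math.log10(kd_molar)

# Hypothetical two-atom ligand: the docked pose is shifted by 1 Å along z.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
pose_rmsd = rmsd(docked, native)   # every atom is displaced by exactly 1 Å
nanomolar_pkd = pkd(1e-9)          # a 1 nM binder
```

A lower RMSD means a pose closer to the native one, and a higher pKd means tighter binding, which is why scores expressed in these units are directly interpretable.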
2.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36502369

ABSTRACT

Recently reported machine learning- and deep learning-based scoring functions (SFs) have shown exciting performance in predicting protein-ligand binding affinities, with promising application prospects. However, differentiating between highly similar ligand conformations, including the native binding pose (the global energy minimum state), remains challenging, and solving this problem could greatly enhance docking. In this work, we propose a fully differentiable, end-to-end framework for ligand pose optimization based on a hybrid SF called DeepRMSD+Vina, which combines a multi-layer perceptron (DeepRMSD) with the traditional AutoDock Vina SF. DeepRMSD+Vina, which combines (1) the root mean square deviation (RMSD) of the docking pose with respect to the native pose and (2) the AutoDock Vina score, is fully differentiable and is thus capable of optimizing the ligand binding pose toward the lowest-energy conformation. Evaluated on the CASF-2016 docking power dataset, DeepRMSD+Vina reaches a success rate of 94.4%, which outperforms most SFs reported to date. We evaluated the ligand conformation optimization framework in practical molecular docking scenarios (redocking and cross-docking tasks), revealing the high potential of this framework in drug design and discovery. Structural analysis shows that this framework can identify key physical interactions in protein-ligand binding, such as hydrogen bonding. Our work provides a paradigm for optimizing ligand conformations based on deep learning algorithms. The DeepRMSD+Vina model and the optimization framework are available at GitHub repository https://github.com/zchwang/DeepRMSD-Vina_Optimization.


Subject(s)
Deep Learning , Ligands , Molecular Docking Simulation , Proteins/chemistry , Drug Design , Protein Binding
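
The success rate quoted in record 2 is a docking power metric. It is commonly computed as the fraction of complexes whose top-scored pose lies within an RMSD cutoff of the native pose. The sketch below assumes a 2 Å cutoff and a score for which lower is better. The function name, input layout and toy data are illustrative assumptions, not part of the DeepRMSD+Vina code.

```python
def docking_success_rate(complexes, cutoff=2.0):
    """Fraction of complexes whose best-scored pose has RMSD <= cutoff (in Å).

    complexes maps a complex id to a list of (score, rmsd_to_native) tuples,
    one per decoy pose. Lower score is assumed to mean a better pose.
    """
    if not complexes:
        raise ValueError("no complexes given")
    hits = 0
    for poses in complexes.values():
        _, top_rmsd = min(poses, key=lambda pose: pose[0])
        if top_rmsd <= cutoff:
            hits += 1
    return hits / len(complexes)

# Hypothetical decoy sets for two complexes.
decoys = {
    "complex_a": [(-9.1, 0.8), (-7.4, 3.5)],   # best score picks the 0.8 Å pose
    "complex_b": [(-6.0, 1.2), (-8.2, 4.9)],   # best score picks the 4.9 Å pose
}
rate = docking_success_rate(decoys)
```

Only the top-ranked pose of each complex counts here, so a scoring function is rewarded for ranking a near-native pose first, not merely for containing one somewhere in its list.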
3.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35289359

ABSTRACT

Scoring functions are important components in molecular docking for structure-based drug discovery. Traditional scoring functions, generally empirical- or force field-based, are robust and have proven to be useful for identifying hits and lead optimizations. Although multiple highly accurate deep learning- or machine learning-based scoring functions have been developed, their direct applications for docking and screening are limited. We describe a novel strategy to develop a reliable protein-ligand scoring function by augmenting the traditional scoring function Vina score using a correction term (OnionNet-SFCT). The correction term is developed based on an AdaBoost random forest model, utilizing multiple layers of contacts formed between protein residues and ligand atoms. In addition to the Vina score, the model considerably enhances the AutoDock Vina prediction abilities for docking and screening tasks based on different benchmarks (such as cross-docking dataset, CASF-2016, DUD-E and DUD-AD). Furthermore, our model could be combined with multiple docking applications to increase pose selection accuracies and screening abilities, indicating its wide usage for structure-based drug discoveries. Furthermore, in a reverse practice, the combined scoring strategy successfully identified multiple known receptors of a plant hormone. To summarize, the results show that the combination of data-driven model (OnionNet-SFCT) and empirical scoring function (Vina score) is a good scoring strategy that could be useful for structure-based drug discoveries and potentially target fishing in future.


Subject(s)
Drug Discovery , Proteins , Drug Discovery/methods , Ligands , Machine Learning , Molecular Docking Simulation , Protein Binding , Proteins/chemistry
4.
J Chem Inf Model ; 64(15): 6205-6215, 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39074901

ABSTRACT

Accurate protein-ligand binding poses are the prerequisites of structure-based binding affinity prediction and provide the structural basis for in-depth lead optimization in small molecule drug design. However, it is challenging to provide reasonable predictions of binding poses for different molecules due to the complexity and diversity of the chemical space of small molecules. Similarity-based molecular alignment techniques can effectively narrow the search range, as structurally similar molecules are likely to have similar binding modes, with higher similarity usually correlated to higher success rates. However, molecular similarity is not consistently high because molecules often require changes to achieve specific purposes, leading to reduced alignment precision. To address this issue, we propose a new alignment method, Z-align. This method uses topological structural information as a criterion for evaluating similarity, reducing the reliance on molecular fingerprint similarity. Our method has achieved success rates significantly higher than those of other methods at moderate levels of similarity. Additionally, our approach can comprehensively and flexibly optimize bond lengths and angles of molecules, maintaining a high accuracy even when dealing with larger molecules. Consequently, our proposed solution helps in achieving more accurate binding poses in protein-ligand docking problems, facilitating the development of small molecule drugs. Z-align is freely available as a web server at https://cloud.zelixir.com/zalign/home.


Subject(s)
Molecular Docking Simulation , Proteins , Ligands , Proteins/chemistry , Proteins/metabolism , Protein Binding , Drug Design , Protein Conformation , Binding Sites
5.
Proteins ; 91(12): 1837-1849, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37606194

ABSTRACT

We introduce a deep learning-based ligand pose scoring model called zPoseScore for predicting protein-ligand complexes in the 15th Critical Assessment of Protein Structure Prediction (CASP15). Our contributions are threefold: first, we generate six training and evaluation data sets by employing advanced data augmentation and sampling methods. Second, we redesign the "zFormer" module, inspired by AlphaFold2's Evoformer, to efficiently describe protein-ligand interactions. This module enables the extraction of protein-ligand paired features that lead to accurate predictions. Finally, we develop the zPoseScore framework with zFormer for scoring and ranking ligand poses, allowing for atomic-level protein-ligand feature encoding and fusion to output refined ligand poses and ligand per-atom deviations. Our results demonstrate excellent performance on various testing data sets, achieving Pearson's correlation R = 0.783 and 0.659 for ranking docking decoys generated based on experimental and predicted protein structures of CASF-2016 protein-ligand complexes. Additionally, we obtain an averaged local distance difference test (lDDT pli = 0.558) of AIchemy LIG2 in CASP15 for de novo protein-ligand complex structure predictions. Detailed analysis shows that accurate ligand binding site prediction and side-chain orientation are crucial for achieving better prediction performance. Our proposed model is one of the most accurate protein-ligand pose prediction models and could serve as a valuable tool in small molecule drug discovery.


Subject(s)
Proteins , Ligands , Protein Binding , Proteins/chemistry , Binding Sites , Molecular Docking Simulation
6.
Phys Chem Chem Phys ; 24(7): 4324-4333, 2022 Feb 16.
Article in English | MEDLINE | ID: mdl-35107451

ABSTRACT

The COVID-19 pandemic caused by SARS-CoV-2 has been declared a global health crisis. The development of anti-SARS-CoV-2 drugs heavily depends on the systematic study of the critical biological processes of key coronavirus proteins, among which the dimerization of the main proteinase (Mpro) is a key step for virus maturation. Because inhibiting Mpro dimerization can efficiently suppress virus maturation, the key residues that mediate dimerization can be treated as targets for drug and antibody development. In this work, the structure and energy features of the Mpro dimers of SARS-CoV-2 and SARS-CoV were studied using molecular dynamics (MD) simulations. The free energy calculations using the Generalized Born (GB) model showed that the dimerization free energy of the SARS-CoV-2 Mpro dimer (-107.5 ± 10.89 kcal mol⁻¹) is more negative than that of the SARS-CoV Mpro dimer (-92.83 ± 9.81 kcal mol⁻¹), indicating a more stable and possibly quicker formation of the Mpro dimer of SARS-CoV-2. In addition, the energy decomposition of each residue revealed 11 key attractive residues. Furthermore, the Thr285Ala substitution weakens the steric hindrance between the two protomers of SARS-CoV-2 Mpro, allowing them to form more intimate interactions. Interestingly, we also found 11 repulsive residues that effectively inhibit the dimerization process. At the interface of the Mpro dimer, we detected three regions rich in interfacial water, which stabilize the SARS-CoV-2 Mpro dimer by forming hydrogen bonds with the two protomers. The key residues and water-rich regions provide important targets for the future design of anti-SARS-CoV-2 drugs that inhibit Mpro dimerization.


Subject(s)
Coronavirus 3C Proteases/chemistry , SARS-CoV-2/enzymology , COVID-19 , Humans , Molecular Docking Simulation , Molecular Dynamics Simulation , Pandemics , Protein Multimerization
7.
Proteins ; 89(12): 1901-1910, 2021 12.
Article in English | MEDLINE | ID: mdl-34473376

ABSTRACT

In this paper, we report our tFold framework's performance on the inter-residue contact prediction task in the 14th Critical Assessment of protein Structure Prediction (CASP14). Our tFold framework seamlessly combines both homologous sequences and structural decoys under an ultra-deep network architecture. Squeeze-excitation and axial attention mechanisms are employed to effectively capture inter-residue interactions. In CASP14, our best predictor achieves 41.78% in the averaged top-L precision for long-range contacts for all the 22 free-modeling (FM) targets, and ranked 1st among all the 60 participating teams. The tFold web server is now freely available at: https://drug.ai.tencent.com/console/en/tfold.


Subject(s)
Neural Networks, Computer , Protein Folding , Proteins , Software , Structural Homology, Protein , Computational Biology , Models, Molecular , Proteins/chemistry , Proteins/metabolism , Reproducibility of Results , Sequence Analysis, Protein
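
The top-L precision for long-range contacts reported in record 7 can be illustrated with a short sketch. It assumes the usual CASP-style convention that a long-range pair has a sequence separation of at least 24 residues and that the L highest-scoring long-range pairs are evaluated, where L is the sequence length. The function name, input layout and toy data are assumptions for illustration, not tFold code.

```python
def top_l_precision(probabilities, true_contacts, seq_len, min_separation=24):
    """Precision of the top-L predicted long-range contacts.

    probabilities maps residue pairs (i, j) with i < j to a predicted contact
    probability. true_contacts is the set of pairs in contact in the reference
    structure. Pairs closer than min_separation in sequence are ignored.
    """
    long_range = [
        (prob, pair)
        for pair, prob in probabilities.items()
        if pair[1] - pair[0] >= min_separation
    ]
    long_range.sort(reverse=True)          # highest probability first
    top = long_range[:seq_len]
    if not top:
        return 0.0
    correct = sum(1 for _, pair in top if pair in true_contacts)
    return correct / len(top)

# Hypothetical predictions, evaluated with L = 2 to keep the example small.
predicted = {(0, 30): 0.9, (1, 40): 0.8, (2, 50): 0.1, (3, 5): 0.99}
reference = {(0, 30), (2, 50), (3, 5)}
precision = top_l_precision(predicted, reference, seq_len=2)
```

The short-range pair (3, 5) is excluded despite its high probability, so the two evaluated pairs are (0, 30) and (1, 40), of which one is a true contact. The sketch divides by the number of pairs actually evaluated. Some evaluations divide by L even when fewer long-range pairs are available.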
8.
Biochemistry ; 58(36): 3777-3788, 2019 09 10.
Article in English | MEDLINE | ID: mdl-31424191

ABSTRACT

Recognition of RNAs under physiological conditions is important for the development of chemical probes and therapeutic ligands. Nucleobase-modified dsRNA-binding PNAs (dbPNAs) are promising for the recognition of dsRNAs in a sequence and structure specific manner under near-physiological conditions. Guanidinium is often present in proteins and small molecules for the recognition of G bases in nucleic acids, in cell-penetrating carriers, and in bioactive drug molecules, which might be due to the fact that guanidinium is amphiphilic and has unique hydrogen bonding and stacking properties. We hypothesized that a simple guanidinium moiety can be directly incorporated into PNAs to facilitate enhanced molecular recognition of G-C pairs in dsRNAs and improved bioactivity. We grafted a guanidinium moiety directly into a PNA monomer (designated as R) using a two-carbon linker as guided by computational modeling studies. The synthetic scheme of the PNA R monomer is relatively simple compared to that of the previously reported L monomer. We incorporated the R residue into various dbPNAs for binding studies. dbPNAs incorporated with R residues are excellent in sequence specifically recognizing G-C pairs in dsRNAs over dsDNA and ssRNAs. We demonstrated that the R residue is compatible with unmodified T and C and previously developed modified L and Q residues in dbPNAs for targeting model dsRNAs, the influenza A viral panhandle duplex structure, and the HIV-1 frameshift site RNA hairpin. Furthermore, R residues enhance the cellular uptake of PNAs.


Subject(s)
DNA/metabolism , Guanidines/chemistry , Peptide Nucleic Acids/metabolism , RNA, Double-Stranded/metabolism , Animals , Base Pairing , Biological Transport , DNA/genetics , HIV-1/chemistry , Hydrogen Bonding , Hydrogen-Ion Concentration , Models, Molecular , Nucleic Acid Conformation , Orthomyxoviridae/chemistry , Peptide Nucleic Acids/chemistry , Peptide Nucleic Acids/genetics , RNA, Double-Stranded/genetics , RNA, Viral/genetics , RNA, Viral/metabolism , Spodoptera/chemistry
9.
Phys Chem Chem Phys ; 20(43): 27439-27448, 2018 Nov 07.
Article in English | MEDLINE | ID: mdl-30357163

ABSTRACT

CRISPR-Cas9, a powerful genome editing tool, has widely been applied in biological fields. Since the discovery of CRISPR-Cas9 as an adaptive immune system, it has been gradually modified to perform precise genome editing in eukaryotic cells by creating double-strand breaks. Although it is robust and efficient, the current CRISPR-Cas9 system faces a major flaw: off-target effects, which are not well understood. Several Cas9 mutants show significant improvement, with very low off-target effects; however, they also show relatively lower cleavage efficiency for on-target sequences. In this study, the dynamics of wild-type Cas9 from Streptococcus pyogenes and a high fidelity Cas9 mutant have been explored using molecular dynamics simulations. It was found that the mutations cause decreased electrostatic interactions between Cas9 and the R-loop. Consequently, the flexibility of the tDNA/sgRNA heteroduplex is decreased, which may explain the lower tolerance of mismatches in the heteroduplex region. The mutations also affect the protein dynamics and the correlation networks among Cas9 domains. In mutant Cas9, weakened communications between two catalytic domains as well as a slight opening of the conformation induced by the mutations account for the lower on-target cleavage efficiency and probably the lower off-target efficiency as well. These findings will facilitate more precise Cas9 engineering in future.


Subject(s)
Bacterial Proteins/metabolism , Endonucleases/metabolism , Mutation , Bacterial Proteins/genetics , CRISPR-Associated Protein 9 , CRISPR-Cas Systems , Endonucleases/genetics , Protein Binding/genetics , Protein Domains , Static Electricity
10.
J Hazard Mater ; 475: 134828, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-38876015

ABSTRACT

The prediction of ecological toxicity plays an increasingly important role in modern society. However, the existing models often suffer from poor performance and limited predictive capabilities. In this study, we propose a novel approach for ecological toxicity assessment based on pre-trained models. By leveraging pre-training techniques and graph neural network models, we establish a high-performance predictive model. Furthermore, we incorporate a variational autoencoder to optimize the model, enabling simultaneous discrimination of toxicity to bees and molecular degradability. Additionally, despite the low similarity between the endogenous hormones in bees and the compounds in our dataset, our model confidently predicts that these hormones are non-toxic to bees, which further strengthens the credibility and accuracy of our model. We also discovered a negative correlation between the degradation and the bee toxicity of compounds. In summary, this study presents an ecological toxicity assessment model with outstanding performance. The proposed model accurately predicts the toxicity of chemicals to bees and their degradability, offering valuable technical support to relevant fields.


Subject(s)
Neural Networks, Computer , Bees/drug effects , Animals , Ecotoxicology , Toxicity Tests
11.
Int J Biol Macromol ; 278(Pt 4): 135064, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39182884

ABSTRACT

Enzyme specificity towards cofactors like NAD(P)H is crucial for applications in bioremediation and eco-friendly chemical synthesis. Despite their role in converting pollutants and creating sustainable products, predicting enzyme specificity faces challenges due to sparse data and inadequate models. To bridge this gap, we developed the cutting-edge INSIGHT platform to enhance the prediction of coenzyme specificity in NAD(P)-dependent enzymes. INSIGHT integrates extensive data from principal bioinformatics resources, concentrating on both NADH and NADPH specificities, and utilizes advanced protein language models to refine the predictions. This integration not only strengthens computational predictions but also meets the practical demands of high-throughput screening and optimization. Experimental validation confirms INSIGHT's effectiveness, boosting our ability to engineer enzymes for efficient, sustainable industrial and environmental processes. This work advances the practical use of computational tools in enzyme research, addressing industrial needs and offering scalable solutions for environmental challenges.


Subject(s)
NADP , NAD , Protein Engineering , NADP/metabolism , NADP/chemistry , Substrate Specificity , NAD/metabolism , NAD/chemistry , Protein Engineering/methods , Computational Biology/methods , Models, Molecular , Coenzymes/metabolism , Coenzymes/chemistry
12.
Commun Biol ; 7(1): 586, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38755285

ABSTRACT

Bats serve as reservoirs for numerous zoonotic viruses, yet they typically remain asymptomatic owing to their unique immune system. Of particular significance is the MHC-I in bats, which plays a crucial role in the antiviral response and exhibits polymorphic amino acid (AA) insertions. This study demonstrated that both 5AA and 3AA insertions enhance the thermal stability of the bat MHC-I complex and enrich the diversity of bound peptides in terms of quantity and length distribution, by stabilizing the 3₁₀ helix, a region prone to conformational changes during peptide loading. However, a mismatched insertion could diminish the stability of bat pMHC-I. We propose that a suitable insertion may help bat MHC-I adapt to high body temperatures during flight while enhancing antiviral responses. Moreover, these site-specific insertions may represent a strategy of evolutionary adaptation of MHC-I molecules to fluctuations in body temperature, as similar insertions have been found in other lower vertebrates.


Subject(s)
Chiroptera , Histocompatibility Antigens Class I , Animals , Histocompatibility Antigens Class I/metabolism , Histocompatibility Antigens Class I/chemistry , Histocompatibility Antigens Class I/genetics , Protein Stability , Peptides/chemistry , Peptides/metabolism , Amino Acids/chemistry , Amino Acids/metabolism , Antigen Presentation , Mutagenesis, Insertional
13.
Protein Sci ; 33(10): e5167, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39276010

ABSTRACT

Predicting the binding of ligands to the human proteome via reverse-docking methods enables the understanding of ligand's interactions with potential protein targets in the human body, thereby facilitating drug repositioning and the evaluation of potential off-target effects or toxic side effects of drugs. In this study, we constructed 11 reverse docking pipelines by integrating site prediction tools (PointSite and SiteMap), docking programs (Glide and AutoDock Vina), and scoring functions (Glide, Autodock Vina, RTMScore, DeepRMSD, and OnionNet-SFCT), and then thoroughly benchmarked their predictive capabilities. The results show that the Glide_SFCT (PS) pipeline exhibited the best target prediction performance based on the atomic structure models in AlphaFold2 human proteome. It achieved a success rate of 27.8% when considering the top 100 ranked prediction. This pipeline effectively narrows the range of potential targets within the human proteome, laying a foundation for drug target prediction, off-target assessment, and toxicity prediction, ultimately boosting drug development. By facilitating these critical aspects of drug discovery and development, our work has the potential to ultimately accelerate the identification of new therapeutic agents and improve drug safety.


Subject(s)
Molecular Docking Simulation , Proteome , Humans , Proteome/chemistry , Proteome/metabolism , Benchmarking , Software , Ligands , Protein Binding , Protein Conformation
14.
Protein Sci ; 33(9): e5097, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39145402

ABSTRACT

Disulfide bonds, covalently formed by sulfur atoms in cysteine residues, play a crucial role in protein folding and structure stability. Considering their significance, artificial disulfide bonds are often introduced to enhance protein thermostability. Although an increasing number of tools can assist with this task, significant amounts of time and resources are often wasted owing to inadequate consideration. To enhance the accuracy and efficiency of designing disulfide bonds for protein thermostability improvement, we initially collected disulfide bond and protein thermostability data from extensive literature sources. Thereafter, we extracted various sequence- and structure-based features and constructed machine-learning models to predict whether disulfide bonds can improve protein thermostability. Among all models, the neighborhood context model based on the Adaboost-DT algorithm performed the best, yielding "area under the receiver operating characteristic curve" and accuracy scores of 0.773 and 0.714, respectively. Furthermore, we also found AlphaFold2 to exhibit high superiority in predicting disulfide bonds, and to some extent, the coevolutionary relationship between residue pairs potentially guided artificial disulfide bond design. Moreover, several mutants of imine reductase 89 (IR89) with artificially designed thermostable disulfide bonds were experimentally proven to be considerably efficient for substrate catalysis. The SS-bond data have been integrated into an online server, namely, ThermoLink, available at guolab.mpu.edu.mo/thermoLink.


Subject(s)
Disulfides , Machine Learning , Disulfides/chemistry , Databases, Protein , Enzyme Stability , Models, Molecular , Protein Folding
15.
Nat Biotechnol ; 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39123049

ABSTRACT

The identification of protein homologs in large databases using conventional methods, such as protein sequence comparison, often misses remote homologs. Here, we offer an ultrafast, highly sensitive method, dense homolog retriever (DHR), for detecting homologs on the basis of a protein language model and dense retrieval techniques. Its dual-encoder architecture generates different embeddings for the same protein sequence and easily locates homologs by comparing these representations. Its alignment-free nature improves speed and the protein language model incorporates rich evolutionary and structural information within DHR embeddings. DHR achieves a >10% increase in sensitivity compared to previous methods and a >56% increase in sensitivity at the superfamily level for samples that are challenging to identify using alignment-based approaches. It is up to 22 times faster than traditional methods such as PSI-BLAST and DIAMOND and up to 28,700 times faster than HMMER. The new remote homologs exclusively found by DHR are useful for revealing connections between well-characterized proteins and improving our knowledge of protein evolution, structure and function.
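
The retrieval step described in record 15, comparing embeddings instead of aligning sequences, can be sketched as a nearest-neighbour search. The example below is not DHR code. It assumes cosine similarity as the comparison and uses small hand-written vectors in place of embeddings from a protein language model. All names and values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def retrieve(query_embedding, database, k=2):
    """Return the ids of the k database entries most similar to the query.

    database maps a protein id to its precomputed embedding. In a dual-encoder
    setup the query and the database entries may come from different encoders.
    """
    ranked = sorted(
        database.items(),
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [protein_id for protein_id, _ in ranked[:k]]

# Toy 2-dimensional "embeddings" (real ones have hundreds of dimensions).
embeddings = {
    "protA": [1.0, 0.0],
    "protB": [0.9, 0.1],
    "protC": [0.0, 1.0],
}
candidates = retrieve([1.0, 0.05], embeddings, k=2)
```

Because database embeddings can be computed once and reused, each query costs one encoding plus a similarity search, with no pairwise alignment. This linear scan is only for illustration. Large databases typically rely on an approximate nearest-neighbour index.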

16.
Gut Microbes ; 15(1): 2227434, 2023.
Article in English | MEDLINE | ID: mdl-37349961

ABSTRACT

The demonstration of a cellulose-degrading bacterium in the human gut changed the view that humans cannot degrade cellulose. However, the molecular-level investigation of cellulose degradation by the human gut microbiota has not yet been completed. Here, to clarify the molecular mechanism, we used cellobiose as a model substrate, which promoted the growth of key members of the human gut microbiota such as Bacteroides ovatus (BO). Our results showed that a new polysaccharide utilization locus (PUL) from BO was involved in cellobiose capture and degradation. Further, two new cell-surface cellulases, BACOVA_02626GH5 and BACOVA_02630GH5, were found to degrade cellobiose into glucose. The predicted structures of BACOVA_02626GH5 and BACOVA_02630GH5 were highly homologous to cellulases from soil bacteria, and the catalytic residues, two glutamate residues, were highly conserved. In murine experiments, we observed that cellobiose reshaped the composition of the gut microbiota and probably modified the metabolic function of the bacteria. Taken together, our findings strengthen the evidence that cellulose can be degraded by human gut microbes and provide new insight into cellulose research.


Subject(s)
Cellobiose , Gastrointestinal Microbiome , Humans , Animals , Mice , Cellobiose/metabolism , Cellulose/metabolism , Polysaccharides/metabolism
17.
Sci Data ; 9(1): 71, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35241693

ABSTRACT

Intrinsic solubility is a critical property in the pharmaceutical industry that affects the in vivo bioavailability of small-molecule drugs. However, solubility prediction with artificial intelligence (AI) faces insufficient data, poor data quality, and a lack of unified measurements for AI and physics-based approaches. We collect 7 aqueous solubility datasets and present a dataset curation workflow. Evaluating the curated data with two expanded deep learning methods, we observe improved RMSE scores on all curated thermodynamic datasets. We also compare the expanded Chemprop enhanced with curated data against a state-of-the-art physics-based approach using Pearson and Spearman correlation coefficients. The expanded Chemprop achieves similar performance, with a Pearson coefficient of 0.930 and a Spearman coefficient of 0.947. We also show that the Pearson and Spearman values improve steadily as the number of data points increases. In addition, the computational advantage of AI models enables quick evaluation of a large set of molecules during the hit identification or lead optimization stages, which supports decision making within the time cycle of the drug discovery stage.
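
The three metrics used in record 17 (RMSE, Pearson and Spearman correlation) can be computed with the standard library alone. The sketch below is illustrative only. The function names and the toy log-solubility values are assumptions and do not come from the paper.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between predictions and measurements."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )

def pearson(x, y):
    """Pearson correlation coefficient (linear association)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical measured and predicted log-solubility values.
measured = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.1, -1.9, -3.5, -3.6]
error = rmse(predicted, measured)
linear = pearson(predicted, measured)
monotonic = spearman(predicted, measured)
```

RMSE measures the absolute error in the units of the property, Pearson measures linear agreement, and Spearman measures only whether the ranking of molecules is preserved. A model can therefore rank compounds perfectly and still have a non-zero RMSE, as in this toy example.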

18.
Front Chem ; 9: 753002, 2021.
Article in English | MEDLINE | ID: mdl-34778208

ABSTRACT

One key task in virtual screening is to accurately predict the binding affinity (ΔG) of protein-ligand complexes. Recently, deep learning (DL) has significantly increased the prediction accuracy of scoring functions owing to its extraordinary ability to extract useful features from raw data. Nevertheless, more effort is still needed in many aspects to increase prediction accuracy and decrease computational cost. In this study, we propose a simple scoring function (called OnionNet-2) based on a convolutional neural network to predict ΔG. The protein-ligand interactions are characterized by the number of contacts between protein residues and ligand atoms in multiple distance shells. Compared to published models, OnionNet-2 is demonstrated to perform best on two widely used benchmarks, CASF-2016 and CASF-2013. The OnionNet-2 model was further verified on non-experimental decoy structures from a docking program and on the CSAR NRC-HiQ data set (a high-quality data set provided by CSAR), with great success. Thus, our study provides a simple but efficient scoring function for predicting protein-ligand binding free energy.
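
Binding affinities in benchmark sets are often tabulated as pKd, while record 18 speaks of the binding free energy. The two are related by the standard thermodynamic identity ΔG = RT ln Kd = -RT ln(10) · pKd, for a 1 M standard state. The sketch below only illustrates this conversion. The function name and the default temperature of 298.15 K are assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def delta_g_from_pkd(pkd, temperature=298.15):
    """Binding free energy in kcal/mol from pKd, using ΔG = -RT ln(10) * pKd."""
    return -R_KCAL * temperature * math.log(10) * pkd

# At room temperature each pKd unit corresponds to roughly 1.36 kcal/mol.
per_unit = delta_g_from_pkd(1.0)
nanomolar = delta_g_from_pkd(9.0)   # a 1 nM binder
```

An error of one pKd unit therefore corresponds to roughly 1.4 kcal/mol at room temperature, which gives a physical scale for reading the errors of scoring functions.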

19.
Front Chem ; 7: 315, 2019.
Article in English | MEDLINE | ID: mdl-31134186

ABSTRACT

Progesterone receptor (PR) is a member of the nuclear receptor (NR) superfamily and plays a vital role in the female reproductive system. Its malfunction can lead to several types of cancer. Understanding the conformational changes in its ligand binding domain (LBD) is valuable for both biological function studies and therapeutic interventions. A key unsolved question is how the binding of a ligand (agonist, antagonist, or selective modulator) induces conformational changes of the PR LBD, especially its helix 12. We applied molecular dynamics (MD) simulations to explore the conformational adaptations of the PR LBD with or without a bound ligand or co-repressor peptide. In the simulations, both the agonist progesterone (P4) and the selective PR modulator (SPRM) asoprisnil induce agonistic-like helix 12 conformations (the "closed" states) in the PR LBD, and the LBD-SPRM complex is less stable than the agonist-liganded PR LBD. The results, therefore, explain the partial agonism of the SPRM, which could induce weak agonistic effects in PR. We also found that co-repressor peptides could be stably associated with the LBD and stabilize it in a "semi-open" state for helix 12. These findings enhance our understanding of PR structure-function relationships and should also be useful for future structure- and knowledge-based drug discovery.

20.
ACS Omega ; 4(14): 15956-15965, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-31592466

ABSTRACT

Computational drug discovery provides an efficient tool for helping large-scale lead molecule screening. One of the major tasks of lead discovery is identifying molecules with promising binding affinities toward a target, a protein in general. The accuracies of current scoring functions that are used to predict the binding affinity are not satisfactory enough. Thus, machine learning or deep learning based methods have been developed recently to improve the scoring functions. In this study, a deep convolutional neural network model (called OnionNet) is introduced; its features are based on rotation-free element-pair-specific contacts between ligands and protein atoms, and the contacts are further grouped into different distance ranges to cover both the local and nonlocal interaction information between the ligand and the protein. The prediction power of the model is evaluated and compared with other scoring functions using the comparative assessment of scoring functions (CASF-2013) benchmark and the v2016 core set of the PDBbind database. The robustness of the model is further explored by predicting the binding affinities of the complexes generated from docking simulations instead of experimentally determined PDB structures.
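
The feature construction described in record 20, element-pair-specific contacts grouped into distance ranges, can be sketched as a simple counting routine. This is not the OnionNet implementation. The shell boundaries, function name and toy atoms below are assumptions chosen for illustration.

```python
import math
from collections import Counter

def shell_contacts(ligand_atoms, protein_atoms, edges):
    """Count protein-ligand atom pairs per element pair and distance shell.

    Atoms are (element, (x, y, z)) tuples. edges is an ascending list of shell
    boundaries in Å. Shell k covers distances in [edges[k], edges[k + 1]).
    Keys of the returned Counter are (protein_element, ligand_element, k).
    """
    counts = Counter()
    for lig_element, lig_xyz in ligand_atoms:
        for prot_element, prot_xyz in protein_atoms:
            distance = math.dist(lig_xyz, prot_xyz)
            for k in range(len(edges) - 1):
                if edges[k] <= distance < edges[k + 1]:
                    counts[(prot_element, lig_element, k)] += 1
                    break
    return counts

# Hypothetical one-atom ligand and three protein atoms, with two 1 Å shells.
ligand = [("C", (0.0, 0.0, 0.0))]
protein = [
    ("N", (0.0, 0.0, 0.5)),    # falls in shell 0
    ("O", (0.0, 0.0, 1.5)),    # falls in shell 1
    ("C", (0.0, 0.0, 10.0)),   # beyond the last boundary, not counted
]
features = shell_contacts(ligand, protein, edges=[0.0, 1.0, 2.0])
```

The counts depend only on interatomic distances, so they do not change when the whole complex is rotated or translated, which is the sense in which such features are rotation-free. Inner shells capture local interactions and outer shells capture nonlocal ones.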
