Results 1 - 20 of 89
1.
Nat Commun ; 15(1): 7348, 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39187482

ABSTRACT

Annotating active sites in enzymes is crucial for advancing multiple fields including drug discovery, disease research, enzyme engineering, and synthetic biology. Despite the development of numerous automated annotation algorithms, a significant trade-off between speed and accuracy limits their large-scale practical applications. We introduce EasIFA, an enzyme active site annotation algorithm that fuses latent enzyme representations from a protein language model and a 3D structural encoder, and then aligns protein-level information with the knowledge of enzymatic reactions using a multi-modal cross-attention framework. EasIFA outperforms BLASTp with a 10-fold speed increase and improves recall, precision, F1 score, and MCC by 7.57%, 13.08%, 9.68%, and 0.1012, respectively. It also surpasses an empirical-rule-based algorithm and another state-of-the-art deep learning annotation method based on PSSM features, achieving a speed increase ranging from 650 to 1400 times while enhancing annotation quality. This makes EasIFA a suitable replacement for conventional tools in both industrial and academic settings. EasIFA can also effectively transfer knowledge gained from coarsely annotated enzyme databases to smaller, high-precision datasets, highlighting its ability to model sparse and high-quality databases. Additionally, EasIFA shows potential as a catalytic site monitoring tool for designing enzymes with desired functions beyond their natural distribution.


Subject(s)
Algorithms , Catalytic Domain , Deep Learning , Enzymes , Enzymes/metabolism , Enzymes/chemistry , Protein Databases , Molecular Sequence Annotation/methods , Computational Biology/methods
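
The EasIFA abstract above reports recall, precision, F1 score, and MCC. As a reading aid, the following is a minimal sketch of how these four metrics can be computed for binary per-residue labels (1 = active site, 0 = other). The function name and the toy labels are hypothetical and are not taken from the paper; the paper's own evaluation protocol may differ.

```python
import math

def binary_site_metrics(y_true, y_pred):
    """Precision, recall, F1 and MCC for binary per-residue labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}

# Hypothetical 10-residue example: 3 true active-site residues.
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
y_pred = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]
metrics = binary_site_metrics(y_true, y_pred)
print(metrics)
```

Because active-site residues are rare relative to all residues, MCC is often more informative than accuracy here: it uses all four cells of the confusion matrix.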
2.
Indian J Dermatol ; 69(3): 264-267, 2024.
Article in English | MEDLINE | ID: mdl-39119326

ABSTRACT

Biologics have expanded the armamentarium for psoriasis, but there has been a growing concern about the risk of lymphoma in patients under tumour necrosis factor (TNF)-α inhibitor and methotrexate. Besides, the mRNA-based coronavirus disease 2019 (COVID-19) vaccination was known to stimulate the proliferation of T-follicular helper cells. We report a case of a patient with psoriasis under adalimumab developing nodal T-follicular helper cell lymphoma, angioimmunoblastic-type following the mRNA-1273 COVID-19 vaccine. We suspect that adalimumab, methotrexate, Epstein-Barr virus (EBV) reactivation, previous reactive lymphoid hyperplasia and psoriasis per se predispose our patient to a lymphoma-prone condition, and the two doses of the mRNA vaccine act as the last straw.

3.
Anal Chem ; 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39011990

ABSTRACT

Analyzing drug-related interactions in the field of biomedicine has been a critical aspect of drug discovery and development. While various artificial intelligence (AI)-based tools have been proposed to analyze drug biomedical associations (DBAs), their feature encoding did not adequately account for crucial biomedical functions and semantic concepts, thereby still hindering their progress. Since the advent of ChatGPT by OpenAI in 2022, large language models (LLMs) have demonstrated rapid growth and significant success across various applications. Herein, LEDAP was introduced, which uniquely leveraged LLM-based biotext feature encoding for predicting drug-disease associations, drug-drug interactions, and drug-side effect associations. Benefiting from the large-scale knowledgebase pre-training, LLMs had great potential in drug development analysis owing to their holistic understanding of natural language and human topics. LEDAP illustrated its notable competitiveness in comparison with other popular DBA analysis tools. Specifically, even in simple conjunction with classical machine learning methods, LLM-based feature representations consistently enabled satisfactory performance across diverse DBA tasks like binary classification, multiclass classification, and regression. Our findings underpinned the considerable potential of LLMs in drug development research, indicating a catalyst for further progress in related fields.

4.
Research (Wash D C) ; 7: 0408, 2024.
Article in English | MEDLINE | ID: mdl-39055686

ABSTRACT

Protein loop modeling is a challenging yet highly nontrivial task in protein structure prediction. Despite recent progress, existing methods including knowledge-based, ab initio, hybrid, and deep learning (DL) methods fall substantially short of either atomic accuracy or computational efficiency. To overcome these limitations, we present KarmaLoop, a novel paradigm that distinguishes itself as the first DL method centered on full-atom (encompassing both backbone and side-chain heavy atoms) protein loop modeling. Our results demonstrate that KarmaLoop considerably outperforms conventional and DL-based methods of loop modeling in terms of both accuracy and efficiency, with the average RMSDs of 1.77 and 1.95 Å for the CASP13+14 and CASP15 benchmark datasets, respectively, and manifests at least 2 orders of magnitude speedup in general compared with other methods. Consequently, our comprehensive evaluations indicate that KarmaLoop provides a state-of-the-art DL solution for protein loop modeling, with the potential to hasten the advancement of protein engineering, antibody-antigen recognition, and drug design.
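
The KarmaLoop abstract above quotes average RMSDs in ångströms. The sketch below shows the basic root-mean-square deviation between two matched sets of atomic coordinates. It assumes the two structures are already in the same reference frame and that atoms are listed in the same order; the coordinates are hypothetical, and the paper's exact RMSD protocol (atom selection, superposition) is not stated in the abstract.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, no superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical three-atom loop fragment, shifted by 1 Å along z.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(predicted, reference))
```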

5.
Nat Commun ; 15(1): 6404, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39080274

ABSTRACT

Retrosynthesis is a crucial task in drug discovery and organic synthesis, where artificial intelligence (AI) is increasingly employed to expedite the process. However, existing approaches employ token-by-token decoding methods to translate target molecule strings into corresponding precursors, exhibiting unsatisfactory performance and limited diversity. As chemical reactions typically induce local molecular changes, reactants and products often overlap significantly. Inspired by this fact, we propose reframing single-step retrosynthesis prediction as a molecular string editing task, iteratively refining target molecule strings to generate precursor compounds. Our proposed approach involves a fragment-based generative editing model that uses explicit sequence editing operations. Additionally, we design an inference module with reposition sampling and sequence augmentation to enhance both prediction accuracy and diversity. Extensive experiments demonstrate that our model generates high-quality and diverse results, achieving superior performance with a promising top-1 accuracy of 60.8% on the standard benchmark dataset USPTO-50K.
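
The retrosynthesis abstract above reports a top-1 accuracy. As an illustration, here is a minimal sketch of top-k accuracy over ranked precursor predictions. It compares strings exactly, which is a simplifying assumption: real evaluations usually canonicalize SMILES before comparison. The function name and the example strings are hypothetical.

```python
def top_k_accuracy(ranked_predictions, references, k=1):
    """Fraction of targets whose reference appears among the first k predictions."""
    if len(ranked_predictions) != len(references) or not references:
        raise ValueError("inputs must be non-empty and equally long")
    hits = sum(
        1 for preds, ref in zip(ranked_predictions, references)
        if ref in preds[:k]
    )
    return hits / len(references)

# Hypothetical ranked precursor strings for three target molecules.
ranked = [
    ["CCO.CC(=O)O", "CCO"],
    ["c1ccccc1Br", "c1ccccc1Cl"],
    ["CC(=O)Cl.N", "CC(=O)O.N"],
]
refs = ["CCO.CC(=O)O", "c1ccccc1Cl", "CC(=O)O.N"]

top1 = top_k_accuracy(ranked, refs, k=1)
top2 = top_k_accuracy(ranked, refs, k=2)
print(top1, top2)
```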

6.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38960407

ABSTRACT

The optimization of therapeutic antibodies through traditional techniques, such as candidate screening via hybridoma or phage display, is resource-intensive and time-consuming. In recent years, computational and artificial intelligence-based methods have been actively developed to accelerate and improve the development of therapeutic antibodies. In this study, we developed an end-to-end sequence-based deep learning model, termed AttABseq, for the predictions of the antigen-antibody binding affinity changes connected with antibody mutations. AttABseq is a highly efficient and generic attention-based model by utilizing diverse antigen-antibody complex sequences as the input to predict the binding affinity changes of residue mutations. The assessment on the three benchmark datasets illustrates that AttABseq is 120% more accurate than other sequence-based models in terms of the Pearson correlation coefficient between the predicted and experimental binding affinity changes. Moreover, AttABseq also either outperforms or competes favorably with the structure-based approaches. Furthermore, AttABseq consistently demonstrates robust predictive capabilities across a diverse array of conditions, underscoring its remarkable capacity for generalization across a wide spectrum of antigen-antibody complexes. It imposes no constraints on the quantity of altered residues, rendering it particularly applicable in scenarios where crystallographic structures remain unavailable. The attention-based interpretability analysis indicates that the causal effects of point mutations on antibody-antigen binding affinity changes can be visualized at the residue level, which might assist automated antibody sequence optimization. We believe that AttABseq provides a fiercely competitive answer to therapeutic antibody optimization.


Subject(s)
Antigen-Antibody Complex , Deep Learning , Antigen-Antibody Complex/chemistry , Antigens/chemistry , Antigens/genetics , Antigens/metabolism , Antigens/immunology , Antibody Affinity , Amino Acid Sequence , Computational Biology/methods , Humans , Mutation , Antibodies/chemistry , Antibodies/immunology , Antibodies/genetics , Antibodies/metabolism
7.
Article in English | MEDLINE | ID: mdl-39073712

ABSTRACT

INTRODUCTION: Knowing the remission duration after biologics discontinuation in patients with psoriasis is important, especially when disease relapse is defined as the restart of systemic agents, because it also reflects the real-world clinical practice when topical treatment alone is not adequate for disease control, and a systemic treatment, including biologic, is needed. Biologics are currently indicated for patients with psoriasis who are candidates for systemic treatments. METHODS: We included 42 patients who were followed up with regularly after the end of risankizumab, guselkumab and mirikizumab trials and investigated the drug-free remission (DFR). A Kaplan-Meier survival analysis and Cox regression model were employed to identify the possible risk factors for relapse. RESULTS: Overall, 38/42 (90.5%) patients experienced relapses after discontinuing trial biologics during the follow-up period of at least 96 weeks and up to 227 weeks. In all patients with relapse, the median DFR was 104 days. Kaplan-Meier survival analysis revealed a significant 1-year drug-free survival (DFS) difference between risankizumab (Z) and guselkumab (T) + mirikizumab (M) (p = 0.0462). A difference in DFS curves was noted when patients were categorized by disease duration > or ≤ 2 years (p = 0.1577) and maintenance of a psoriasis area and severity index score (PASI) of 90 at the end of trials (p = 0.1177). Univariate Cox regression model identified that age [hazard ratio (HR) = 1.030 (1.000-1.060), p = 0.0467] and disease duration [HR = 1.046(1.009-1.084), p = 0.0134] were significantly associated with relapse risk. A risk model was established on the basis of multivariable Cox regression results. Risk value = 0.021038 * Age + 0.515628 * Biologic_type (Z = 0,T/M = 1) + 0.025048 * Disease_Duration. The validated patients were divided into two groups by median risk value (1.5). 
The high-risk group (risk value > 1.5) had a non-significant higher relapse risk than the low-risk group (risk value < 1.5), with a hazard ratio of 1.62 [95% confidence interval (CI) = 0.82-3.23, p = 0.1809]. CONCLUSION: Types of biologics used, disease duration > or ≤ 2 years, and PASI 90 improvement at the end of trial affect the 1-year DFS after biologics discontinuation. Further studies consisting of a larger patient number and longer follow-up period are needed to verify our findings. TRIAL REGISTRATION: ClinicalTrials.gov identifiers NCT02694523, NCT03047395, NCT02207224, NCT02576431, NCT03482011, and NCT03556202.
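
The risk model quoted in the abstract above is a plain linear score, so it can be written out directly. The sketch below uses the three coefficients, the biologic coding (Z = 0, T/M = 1), and the median cutoff of 1.5 exactly as stated in the abstract. The units for age and disease duration are assumed to be years, which the abstract does not state, and a value exactly equal to 1.5 is assigned to the low-risk group as an arbitrary choice. The abstract itself reports that the difference between groups was not statistically significant, so this is an arithmetic illustration and not a clinical tool.

```python
# Coefficients and cutoff as quoted in the abstract.
AGE_COEF = 0.021038
BIOLOGIC_COEF = 0.515628
DURATION_COEF = 0.025048
MEDIAN_CUTOFF = 1.5

def risk_value(age, biologic_type, disease_duration):
    """Linear risk score; biologic_type is 0 for risankizumab (Z), 1 for guselkumab/mirikizumab (T/M)."""
    if biologic_type not in (0, 1):
        raise ValueError("biologic_type must be 0 (Z) or 1 (T/M)")
    return (AGE_COEF * age
            + BIOLOGIC_COEF * biologic_type
            + DURATION_COEF * disease_duration)

def risk_group(value):
    """'high' above the median cutoff, otherwise 'low' (tie handling is an assumption)."""
    return "high" if value > MEDIAN_CUTOFF else "low"

# Hypothetical patients; age and duration assumed to be in years.
patient_a = risk_value(age=45, biologic_type=1, disease_duration=10)
patient_b = risk_value(age=30, biologic_type=0, disease_duration=5)
print(round(patient_a, 6), risk_group(patient_a))
print(round(patient_b, 6), risk_group(patient_b))
```

For the first hypothetical patient the terms are 0.94671 + 0.515628 + 0.25048 = 1.712818, which falls in the high-risk group; the second gives 0.63114 + 0 + 0.12524 = 0.75638, which falls in the low-risk group.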

8.
J Chem Inf Model ; 64(14): 5381-5391, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38920405

ABSTRACT

Artificial intelligence (AI)-aided drug design has demonstrated unprecedented effects on modern drug discovery, but there is still an urgent need for user-friendly interfaces that bridge the gap between these sophisticated tools and scientists, particularly those who are less computer savvy. Herein, we present DrugFlow, an AI-driven one-stop platform that offers a clean, convenient, and cloud-based interface to streamline early drug discovery workflows. By seamlessly integrating a range of innovative AI algorithms, covering molecular docking, quantitative structure-activity relationship modeling, molecular generation, ADMET (absorption, distribution, metabolism, excretion and toxicity) prediction, and virtual screening, DrugFlow can offer effective AI solutions for almost all crucial stages in early drug discovery, including hit identification and hit/lead optimization. We hope that the platform can provide sufficiently valuable guidance to aid real-world drug design and discovery. The platform is available at https://drugflow.com.


Subject(s)
Artificial Intelligence , Drug Discovery , Drug Discovery/methods , Molecular Docking Simulation , Quantitative Structure-Activity Relationship , Algorithms , Drug Design , Software , Humans , Cloud Computing
11.
Taiwan J Obstet Gynecol ; 63(3): 405-408, 2024 May.
Article in English | MEDLINE | ID: mdl-38802208

ABSTRACT

OBJECTIVE: Impetigo herpetiformis (IH) is a rare form of pustular psoriasis which may result in maternal and fetal morbidity and even mortality. Deficiency of interleukin-36 receptor antagonist (DITRA) is the most frequently identified genetic defect of IH. Currently there are no biologics approved for IH despite the revolutionary role of biologics in the treatment of plaque and pustular psoriasis. Anecdotal reports of biologics use in DITRA patients with IH are also limited. CASE REPORTS: We present herein a case series of 6 Chinese IH patients harboring IL36RN gene c.115+6T>C mutation during 8 pregnancies, treated with various biologics, including adalimumab, etanercept and secukinumab. CONCLUSION: Most pregnancy courses were uneventful, except for one woman who had recurrent episodes of decreased fetal heart rate variability after adalimumab injections, which subsided after switching to etanercept. The treatment effectiveness and safety demonstrated in our cases suggested the role of biologics for the treatment of IH in patients with DITRA.


Subject(s)
Adalimumab , Humanized Monoclonal Antibodies , Etanercept , Pregnancy Complications , Psoriasis , Humans , Female , Pregnancy , Adult , Humanized Monoclonal Antibodies/therapeutic use , Etanercept/therapeutic use , Adalimumab/therapeutic use , Pregnancy Complications/drug therapy , Psoriasis/drug therapy , Psoriasis/genetics , Monoclonal Antibodies/therapeutic use , Interleukins/genetics , Biological Products/therapeutic use , China , Mutation , East Asian People
12.
Acc Chem Res ; 57(10): 1500-1509, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38577892

ABSTRACT

Molecular docking, also termed ligand docking (LD), is a pivotal element of structure-based virtual screening (SBVS) used to predict the binding conformations and affinities of protein-ligand complexes. Traditional LD methodologies rely on a search and scoring framework, utilizing heuristic algorithms to explore binding conformations and scoring functions to evaluate binding strengths. However, to meet the efficiency demands of SBVS, these algorithms and functions are often simplified, prioritizing speed over accuracy. The emergence of deep learning (DL) has exerted a profound impact on diverse fields, ranging from natural language processing to computer vision and drug discovery. DeepMind's AlphaFold2 has impressively exhibited its ability to accurately predict protein structures solely from amino acid sequences, highlighting the remarkable potential of DL in conformation prediction. This groundbreaking advancement circumvents the traditional search-scoring frameworks in LD, enhancing both accuracy and processing speed and thereby catalyzing a broader adoption of DL algorithms in binding pose prediction. Nevertheless, a consensus on certain aspects remains elusive. In this Account, we delineate the current status of employing DL to augment LD within the VS paradigm, highlighting our contributions to this domain. Furthermore, we discuss the challenges and future prospects, drawing insights from our scholarly investigations. Initially, we present an overview of VS and LD, followed by an introduction to DL paradigms, which deviate significantly from traditional search-scoring frameworks. Subsequently, we delve into the challenges associated with the development of DL-based LD (DLLD), encompassing evaluation metrics, application scenarios, and physical plausibility of the predicted conformations. In the evaluation of LD algorithms, it is essential to recognize the multifaceted nature of the metrics.
While the accuracy of binding pose prediction, often measured by the success rate, is a pivotal aspect, the scoring/screening power and computational speed of these algorithms are equally important given the pivotal role of LD tools in VS. Regarding application scenarios, early methods focused on blind docking, where the binding site is unknown. However, recent studies suggest a shift toward identifying binding sites rather than solely predicting binding poses within these models. In contrast, LD with a known pocket in VS has been shown to be more practical. Physical plausibility poses another significant challenge. Although DLLD models often achieve higher success rates compared to traditional methods, they may generate poses with implausible local structures, such as incorrect bond angles or lengths, which are disadvantageous for postprocessing tasks like visualization. Finally, we discuss the future perspectives for DLLD, emphasizing the need to improve generalization ability, strike a balance between speed and accuracy, account for protein conformation flexibility, and enhance physical plausibility. Additionally, we delve into the comparison between generative and regression algorithms in this context, exploring their respective strengths and potential.


Subject(s)
Deep Learning , Molecular Docking Simulation , Ligands , Proteins/chemistry , Proteins/metabolism , Algorithms , Drug Discovery
13.
J Cheminform ; 16(1): 38, 2024 Mar 31.
Article in English | MEDLINE | ID: mdl-38556873

ABSTRACT

Accurate prediction of the enzyme commission (EC) numbers for chemical reactions is essential for the understanding and manipulation of enzyme functions, biocatalytic processes and biosynthetic planning. A number of machine learning (ML)-based models have been developed to classify enzymatic reactions, showing great advantages over costly and long-winded experimental verifications. However, the prediction accuracy for most available models trained on the records of chemical reactions without specifying the enzymatic catalysts is rather limited. In this study, we introduced BEC-Pred, a BERT-based multiclassification model, for predicting EC numbers associated with reactions. Leveraging transfer learning, our approach achieves precise forecasting across a wide variety of Enzyme Commission (EC) numbers solely through analysis of the SMILES sequences of substrates and products. BEC-Pred model outperformed other sequence and graph-based ML methods, attaining a higher accuracy of 91.6%, surpassing them by 5.5%, and exhibiting superior F1 scores with improvements of 6.6% and 6.0%, respectively. The enhanced performance highlights the potential of BEC-Pred to serve as a reliable foundational tool to accelerate the cutting-edge research in synthetic biology and drug metabolism. Moreover, we discussed a few examples on how BEC-Pred could accurately predict the enzymatic classification for the Novozym 435-induced hydrolysis and lipase efficient catalytic synthesis. We anticipate that BEC-Pred will have a positive impact on the progression of enzymatic research.

14.
Ital J Dermatol Venerol ; 159(2): 207-208, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38436614
15.
Exp Dermatol ; 33(3): e15056, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38488485

ABSTRACT

Several studies have suggested that mutation of the interleukin 36 receptor antagonist gene (IL36RN) is related to generalized pustular psoriasis (GPP), and the presence of IL36RN mutation may affect the clinical manifestations and treatment responses. However, genetic testing is not routinely available in clinical practice for the diagnosis of GPP. Previously, GPP patients with acrodermatitis continua of Hallopeau (ACH) were found to have a high percentage of carrying IL36RN mutation. In this study, we reported six patients with pustular psoriasis presenting as diffuse palmoplantar erythema with keratoderma among 60 patients who carried IL36RN mutation. ACH was present in five patients and five patients had acute flare of GPP. This unique presentation may serve as a predictor for IL36RN mutation in patients with pustular psoriasis, similar to ACH.


Subject(s)
Psoriasis , Humans , Psoriasis/genetics , Mutation , Erythema , China , Interleukins/genetics
16.
Phys Chem Chem Phys ; 26(13): 10323-10335, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38501198

ABSTRACT

Ribonucleic acid (RNA)-ligand interactions play a pivotal role in a wide spectrum of biological processes, ranging from protein biosynthesis to cellular reproduction. This recognition has prompted the broader acceptance of RNA as a viable candidate for drug targets. Delving into the atomic-scale understanding of RNA-ligand interactions holds paramount importance in unraveling intricate molecular mechanisms and further contributing to RNA-based drug discovery. Computational approaches, particularly molecular docking, offer an efficient way of predicting the interactions between RNA and small molecules. However, the accuracy and reliability of these predictions heavily depend on the performance of scoring functions (SFs). In contrast to the majority of SFs used in RNA-ligand docking, the end-point binding free energy calculation methods, such as molecular mechanics/generalized Born surface area (MM/GBSA) and molecular mechanics/Poisson Boltzmann surface area (MM/PBSA), stand as theoretically more rigorous approaches. Yet, the evaluation of their effectiveness in predicting both binding affinities and binding poses within RNA-ligand systems remains unexplored. This study first reported the performance of MM/PBSA and MM/GBSA with diverse solvation models, interior dielectric constants (εin) and force fields in the context of binding affinity prediction for 29 RNA-ligand complexes. MM/GBSA is based on short (5 ns) molecular dynamics (MD) simulations in an explicit solvent with the YIL force field; the GBGBn2 model with higher interior dielectric constant (εin = 12, 16 or 20) yields the best correlation (Rp = -0.513), which outperforms the best correlation (Rp = -0.317, rDock) offered by various docking programs. Then, the efficacy of MM/GBSA in identifying the near-native binding poses from the decoys was assessed based on 56 RNA-ligand complexes. 
However, it is evident that MM/GBSA has limitations in accurately predicting binding poses for RNA-ligand systems, particularly compared with notably proficient docking programs like rDock and PLANTS. The best top-1 success rate achieved by MM/GBSA rescoring is 39.3%, which falls below the best results given by docking programs (50%, PLANTS). This study represents the first evaluation of MM/PBSA and MM/GBSA for RNA-ligand systems and is expected to provide valuable insights into their successful application to RNA targets.


Subject(s)
Molecular Dynamics Simulation , RNA , Molecular Docking Simulation , Ligands , Reproducibility of Results , Protein Binding , Thermodynamics , Binding Sites
17.
J Chem Inf Model ; 64(4): 1213-1228, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38302422

ABSTRACT

Deep learning-based de novo molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score. To address these limitations, we proposed a genetic algorithm-guided generative model called GARel (genetic algorithm-based receptor-ligand interaction generator), a novel framework for training a DL-based generative model to produce drug-like molecules with novel scaffolds. To efficiently train the GARel model, we utilized dense net to update the parameters based on molecules with novel scaffolds and drug-like features. To demonstrate the capability of the GARel model, we used it to design inhibitors for three targets: AA2AR, EGFR, and SARS-CoV-2. The results indicate that GARel-generated molecules feature more diverse and novel scaffolds and possess more desirable physicochemical properties and favorable docking scores. Compared with other generative models, GARel makes significant progress in balancing novelty and drug-likeness, providing a promising direction for the further development of DL-based de novo design methodology with potential impacts on drug discovery.


Subject(s)
Drug Design , Viral RNA , Ligands , Algorithms , Drug Discovery
18.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38340091

ABSTRACT

Discovering effective anti-tumor drug combinations is crucial for advancing cancer therapy. Taking full account of intricate biological interactions is highly important in accurately predicting drug synergy. However, the extremely limited prior knowledge poses great challenges in developing current computational methods. To address this, we introduce SynergyX, a multi-modality mutual attention network to improve anti-tumor drug synergy prediction. It dynamically captures cross-modal interactions, allowing for the modeling of complex biological networks and drug interactions. A convolution-augmented attention structure is adopted to integrate multi-omic data in this framework effectively. Compared with other state-of-the-art models, SynergyX demonstrates superior predictive accuracy in both the General Test and Blind Test and cross-dataset validation. By exhaustively screening combinations of approved drugs, SynergyX reveals its ability to identify promising drug combination candidates for potential lung cancer treatment. Another notable advantage lies in its multidimensional interpretability. Taking Sorafenib and Vorinostat as an example, SynergyX serves as a powerful tool for uncovering drug-gene interactions and deciphering cell selectivity mechanisms. In summary, SynergyX provides an illuminating and interpretable framework, poised to catalyze the expedition of drug synergy discovery and deepen our comprehension of rational combination therapy.


Subject(s)
Drug Discovery , Lung Neoplasms , Humans , Catalysis , Combined Modality Therapy , Research Design
19.
J Chem Theory Comput ; 20(3): 1465-1478, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38300792

ABSTRACT

Multisite λ-dynamics (MSLD) is a highly efficient binding free energy calculation method that samples multiple ligands in a single round by assigning different λ values to the alchemical part of each ligand. This method holds great promise for lead optimization (LO) in drug discovery. However, the complex data preparation and simulation process limits its widespread application in diverse protein-ligand systems. To address this challenge, we developed a comprehensive, open-source, and automated workflow for MSLD calculations based on the BLaDE dynamics engine. This workflow incorporates the Ligand Internal and Cartesian coordinate reconstruction-based alignment algorithm (LIC-align) and an optimized maximum common substructure (MCS) search algorithm to accurately generate MSLD multiple topologies with ideal perturbation patterns. Furthermore, our workflow is highly modularized, allowing straightforward integration and extension of various simulation techniques, and is highly accessible to nonexperts. This workflow was validated by calculating the relative binding free energies of large-scale congeneric ligands, many of which have large perturbing groups. The agreement between the calculations and experiments was excellent, with an average unsigned error of 1.08 ± 0.47 kcal/mol. More than 57.1% of the ligands had an error of less than 1.0 kcal/mol, and the perturbations of 6 targets were fully connected via the calculations, while those of 2 targets were connected via both calculations and experimental data. The Pearson correlation coefficient reached 0.88, indicating that the MSLD workflow provides accurate predictions that can guide lead optimization in drug discovery. We also examined the impact of single-site versus multisite perturbations, ligand grouping by perturbing group size, and the position of the anchor atom on the MSLD performance. 
By integrating our proposed LIC-align and optimized MCS search algorithm along with the coping strategies to handle challenging molecular substructures, our workflow can handle many realistic scenarios more reasonably than all previously published methods. Moreover, we observed that our MSLD workflow achieved similar accuracy to free energy perturbation (FEP) while improving computational efficiency by over 1 order of magnitude in speedup. These findings provide valuable insights and strategies for further MSLD development, making MSLD a competitive tool for lead optimization.


Subject(s)
Molecular Dynamics Simulation , Proteins , Thermodynamics , Ligands , Workflow , Proteins/chemistry , Protein Binding
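
The MSLD abstract above summarizes agreement with experiment using an average unsigned error, the fraction of ligands with an error below 1.0 kcal/mol, and a Pearson correlation coefficient. The sketch below shows how these three summary statistics can be computed from paired predicted and experimental binding free energies. The function names and the four data points are hypothetical and do not come from the paper.

```python
import math

def mean_unsigned_error(predicted, experimental):
    """Average absolute difference between paired values."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)

def fraction_within(predicted, experimental, tolerance=1.0):
    """Fraction of pairs whose absolute error is strictly below the tolerance."""
    hits = sum(1 for p, e in zip(predicted, experimental) if abs(p - e) < tolerance)
    return hits / len(predicted)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Hypothetical binding free energies in kcal/mol.
experimental = [-9.0, -8.0, -7.0, -10.0]
predicted = [-8.5, -8.5, -6.0, -11.5]

mue = mean_unsigned_error(predicted, experimental)
within = fraction_within(predicted, experimental, tolerance=1.0)
r = pearson_r(predicted, experimental)
print(mue, within, round(r, 3))
```

Reporting both an error measure and a correlation is useful because a method can rank ligands well (high correlation) while still being offset in absolute terms (large unsigned error), or the reverse.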
20.
Research (Wash D C) ; 7: 0292, 2024.
Article in English | MEDLINE | ID: mdl-38213662

ABSTRACT

Deep learning (DL)-driven efficient synthesis planning may profoundly transform the paradigm for designing novel pharmaceuticals and materials. However, the progress of many DL-assisted synthesis planning (DASP) algorithms has suffered from the lack of reliable automated pathway evaluation tools. As a critical metric for evaluating chemical reactions, accurate prediction of reaction yields helps improve the practicality of DASP algorithms in the real-world scenarios. Currently, accurately predicting yields of interesting reactions still faces numerous challenges, mainly including the absence of high-quality generic reaction yield datasets and robust generic yield predictors. To compensate for the limitations of high-throughput yield datasets, we curated a generic reaction yield dataset containing 12 reaction categories and rich reaction condition information. Subsequently, by utilizing 2 pretraining tasks based on chemical reaction masked language modeling and contrastive learning, we proposed a powerful bidirectional encoder representations from transformers (BERT)-based reaction yield predictor named Egret. It achieved comparable or even superior performance to the best previous models on 4 benchmark datasets and established state-of-the-art performance on the newly curated dataset. We found that reaction-condition-based contrastive learning enhances the model's sensitivity to reaction conditions, and Egret is capable of capturing subtle differences between reactions involving identical reactants and products but different reaction conditions. Furthermore, we proposed a new scoring function that incorporated Egret into the evaluation of multistep synthesis routes. Test results showed that yield-incorporated scoring facilitated the prioritization of literature-supported high-yield reaction pathways for target molecules. 
In addition, through a meta-learning strategy, we further improved the reliability of the model's prediction for reaction types with limited data and lower data quality. Our results suggest that Egret holds the potential to become an essential component of next-generation DASP tools.
