Results 1 - 20 of 38
1.
Acta Pharmacol Sin ; 45(9): 1978-1991, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38750073

ABSTRACT

Prostate cancer (PCa) is the second most prevalent malignancy among men worldwide. Aberrant activation of androgen receptor (AR) signaling is recognized as a crucial oncogenic driver of PCa, and AR antagonists are widely used in PCa therapy. To develop novel AR antagonists, a machine-learning MIEC-SVM model was established for virtual screening, and 51 candidates were selected and submitted for bioactivity evaluation. Surprisingly, a new-scaffold AR antagonist, C2, with bioactivity comparable to that of enzalutamide (Enz) was identified in the initial round of screening. C2 showed pronounced inhibition of the transcriptional function (IC50 = 0.63 µM) and nuclear translocation of AR, and significant antiproliferative and antimetastatic activity in the LNCaP PCa cell line. In addition, C2 blocked the LNCaP cell cycle more strongly than Enz at a lower dose and showed superior AR specificity. Our study highlights the success of MIEC-SVM in discovering AR antagonists, and compound C2 presents a promising new scaffold for the development of AR-targeted therapeutics.


Subject(s)
Androgen Receptor Antagonists , Cell Proliferation , Prostatic Neoplasms , Receptors, Androgen , Humans , Androgen Receptor Antagonists/pharmacology , Androgen Receptor Antagonists/chemistry , Receptors, Androgen/metabolism , Cell Proliferation/drug effects , Male , Cell Line, Tumor , Prostatic Neoplasms/drug therapy , Prostatic Neoplasms/pathology , Antineoplastic Agents/pharmacology , Antineoplastic Agents/chemistry , Machine Learning , Structure-Activity Relationship , Cell Cycle/drug effects
2.
J Cheminform ; 15(1): 48, 2023 Apr 23.
Article in English | MEDLINE | ID: mdl-37088813

ABSTRACT

Identification and validation of the targets of bioactive small molecules is a significant challenge in drug discovery. In recent years, various in silico approaches have been proposed to expedite the time- and resource-consuming experiments needed for target detection. Herein, we developed several chemogenomic models for target prediction based on multi-scale information of chemical structures and protein sequences. By pairing a compound with multiple protein targets and feeding these compound-target pairs into a well-established model, scores indicating whether a compound and a target interact can be derived, and a target prediction task can thus be completed by sorting the output scores. To improve prediction performance, we constructed several such models, and the ensemble model with the best performance was used as our final model. The model was validated by various strategies and external datasets, which confirmed its promising target prediction capability, i.e., the fraction of known targets identified in the top-k (1 to 10) list of potential target candidates suggested by the model. Compared with multiple state-of-the-art target prediction methods, our model showed equivalent or better predictive ability in terms of top-k predictions. It is expected that our method can be utilized as a powerful computational tool to narrow down the potential targets for experimental testing.
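
The top-k evaluation mentioned in this abstract can be illustrated with a small sketch. It is not the authors' code: the function names, the dictionary layout, and the toy compounds, targets, and scores are all assumptions made for illustration. Targets are ranked by score for each compound, and the metric is the fraction of known compound-target pairs that land in the top-k list.

```python
def rank_targets(scores):
    """Sort target names by descending score (ties broken alphabetically)."""
    return [t for t, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


def top_k_hit_fraction(predictions, known_targets, k):
    """Fraction of known compound-target pairs recovered in the top-k lists."""
    hits = 0
    total = 0
    for compound, scores in predictions.items():
        top_k = set(rank_targets(scores)[:k])
        known = known_targets.get(compound, set())
        hits += len(top_k & known)
        total += len(known)
    return hits / total if total else 0.0


# Toy, made-up interaction scores for two hypothetical compounds.
predictions = {
    "cpd_A": {"T1": 0.91, "T2": 0.40, "T3": 0.75},
    "cpd_B": {"T1": 0.20, "T2": 0.85, "T3": 0.60},
}
known_targets = {"cpd_A": {"T3"}, "cpd_B": {"T2"}}

for k in (1, 2):
    print(k, top_k_hit_fraction(predictions, known_targets, k))
```

With these toy numbers, only one of the two known targets is ranked first, while both appear within the top two.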

3.
J Chem Inf Model ; 63(8): 2345-2359, 2023 04 24.
Article in English | MEDLINE | ID: mdl-37000044

ABSTRACT

The n-octanol/buffer solution distribution coefficient at pH = 7.4 (log D7.4) is an indicator of lipophilicity, and it influences a wide variety of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties as well as the druggability of compounds. In log D7.4 prediction, graph neural networks (GNNs) can uncover subtle structure-property relationships (SPRs) by automatically extracting features from molecular graphs, but their performance is often limited by the small size of available datasets. Herein, we present a transfer learning strategy called pretraining on computational data and then fine-tuning on experimental data (PCFE) to fully exploit the predictive potential of GNNs. PCFE works by pretraining a GNN model on 1.71 million computed log D values (low-fidelity data) and then fine-tuning it on 19,155 experimental log D7.4 values (high-fidelity data). Experiments with three GNN architectures (graph convolutional network (GCN), graph attention network (GAT), and Attentive FP) demonstrated the effectiveness of PCFE in improving GNNs for log D7.4 prediction. Moreover, the optimal PCFE-trained GNN model (cx-Attentive FP, test-set R² = 0.909) outperformed four strong descriptor-based models (random forest (RF), gradient boosting (GB), support vector machine (SVM), and extreme gradient boosting (XGBoost)). The robustness of the cx-Attentive FP model was also confirmed by evaluating it with different training data sizes and dataset splitting strategies. We therefore developed a webserver and defined the applicability domain for this model. The webserver (http://tools.scbdd.com/chemlogd/) provides free log D7.4 prediction services. In addition, the important descriptors for log D7.4 were detected by the Shapley additive explanations (SHAP) method, and the substructures most relevant to log D7.4 were identified by the attention mechanism. Finally, matched molecular pair analysis (MMPA) was performed to summarize the contributions of common chemical substituents to log D7.4, including a variety of hydrocarbon groups, halogen groups, heteroatoms, and polar groups. In conclusion, we believe that the cx-Attentive FP model can serve as a reliable tool to predict log D7.4 and hope that pretraining on low-fidelity data can help GNNs make accurate predictions of other endpoints in drug discovery.
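
The pretrain-then-fine-tune idea behind PCFE can be shown with a deliberately tiny sketch. It swaps the GNN for a one-variable linear model trained by plain gradient descent, and all data points, step counts, and function names are made up for illustration; it only demonstrates the two-stage training order, not the authors' implementation.

```python
def mse(params, data):
    """Mean squared error of the line y = w*x + b on (x, y) pairs."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(params, data, steps, lr=0.1):
    """Plain gradient descent on the mean squared error."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


# Made-up data: many cheap "computed" labels (low fidelity) and a few
# "experimental" labels (high fidelity) that follow a shifted relationship.
low_fidelity = [(i / 20, 2.0 * (i / 20) + 1.0) for i in range(21)]
high_fidelity = [(0.1, 1.7), (0.5, 2.5), (0.9, 3.3)]

# Stage 1: pretrain from scratch on the large low-fidelity set.
pretrained = train((0.0, 0.0), low_fidelity, steps=2000)
# Stage 2: continue training from the pretrained weights on the small set.
fine_tuned = train(pretrained, high_fidelity, steps=200)

print("error before fine-tuning:", mse(pretrained, high_fidelity))
print("error after fine-tuning: ", mse(fine_tuned, high_fidelity))
```

The point of the design is that the second stage starts from weights that already capture the general trend, so only the offset between computed and experimental values has to be learned from the scarce data.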


Subject(s)
Drug Discovery , Halogens , 1-Octanol , Learning , Neural Networks, Computer
4.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36642412

ABSTRACT

Machine learning-based scoring functions (MLSFs) have become a very favorable alternative to classical scoring functions because of their potentially superior screening performance. However, information on the negative data used to construct MLSFs is rarely reported in the literature, and the putative inactive molecules recorded in existing databases are usually obviously biased relative to the active molecules. Here we propose an easy-to-use method named AMLSF that combines active learning using negative molecular selection strategies with an MLSF, which can iteratively improve the quality of the inactive sets and thus reduce the false positive rate of virtual screening. We chose energy auxiliary terms learning as the MLSF and validated our method on eight targets in the diverse subset of DUD-E. For each target, we screened the IterBioScreen database by AMLSF and compared the screening results with those of four control models. The results illustrate that the number of active molecules among the top 1000 molecules identified by AMLSF was significantly higher than that identified by the control models. In addition, the free energy calculation results for the top 10 molecules screened out by AMLSF, the null model and the control models based on DUD-E also showed that AMLSF can identify more active molecules and reduce the false positive rate.


Subject(s)
Proteins , Proteins/metabolism , Databases, Factual , Ligands , Molecular Docking Simulation , Protein Binding
5.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36681902

ABSTRACT

Identification of potential targets for known bioactive compounds and novel synthetic analogs is of considerable significance. In silico target fishing (TF) has become an alternative strategy because of the expensive and laborious wet-lab experiments, explosive growth of bioactivity data and rapid development of high-throughput technologies. However, TF methods are based on different algorithms, molecular representations and training datasets, which may lead to different results when predicting targets for the same query molecules. This can be confusing for practitioners. Therefore, this study systematically evaluated nine popular ligand-based TF methods using target and ligand-target pair statistical strategies, which will help practitioners choose among multiple TF methods. The evaluation results showed that SwissTargetPrediction produced the most reliable predictions while enriching more targets. The high-recall similarity ensemble approach (SEA) was able to find real targets for more compounds than the other TF methods. Therefore, SwissTargetPrediction and SEA can be considered as primary selection methods in future studies. In addition, the results showed that k = 5 was the optimal number of experimental candidate targets. Finally, a novel ensemble TF method based on consensus voting is proposed to improve the prediction performance. The precision of the ensemble TF method exceeds that of the individual TF methods, indicating that the ensemble method can more effectively identify real targets within a given top-k threshold. The results of this study can be used as a reference to guide practitioners in selecting the most effective methods in computational drug discovery.
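
One simple way to read "consensus voting" over top-k lists is sketched below. The abstract does not describe the voting scheme in detail, so the rule used here (one vote per method for each target in its top-k list, ties broken alphabetically) is an assumption, and the method names and rankings are hypothetical.

```python
from collections import Counter


def consensus_top_k(method_rankings, k):
    """Give each target one vote per method that ranks it in its top-k list,
    then return the k targets with the most votes."""
    votes = Counter()
    for ranking in method_rankings.values():
        votes.update(ranking[:k])
    # Sort by vote count (descending), then by name for a deterministic order.
    ordered = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [target for target, _ in ordered[:k]]


# Hypothetical ranked outputs from three unnamed target-fishing methods.
method_rankings = {
    "method_1": ["T1", "T2", "T3", "T4"],
    "method_2": ["T2", "T1", "T5", "T3"],
    "method_3": ["T2", "T6", "T1", "T7"],
}

print(consensus_top_k(method_rankings, 2))
```

Targets that only one method favors (such as T6 here) are outvoted, which is the intuition behind the precision gain reported for the ensemble.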


Subject(s)
Algorithms , Ligands
6.
Acta Pharmacol Sin ; 44(7): 1500-1518, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36639570

ABSTRACT

Androgen receptor (AR) antagonists are a major class of medicine for treating the lethal, castration-resistant type of prostate cancer (PCa), but their long-term use commonly leads to antiandrogen resistance. When the AR signaling pathway is blocked by AR-targeted therapy, the glucocorticoid receptor (GR) can compensate for AR function, especially at the late stage of PCa. An AR-GR dual antagonist is expected to be a good solution for this situation. Nevertheless, no effective non-steroidal AR-GR dual antagonist has been reported so far. In this study, an AR-GR dual binder, H18, was first discovered by combining structure-based virtual screening and biological evaluation. Then, with the aid of computationally guided design, the AR-GR dual antagonist HD57 was identified, with antagonistic activity towards both AR (IC50 = 0.394 µM) and GR (IC50 = 17.81 µM). Moreover, HD57 could effectively antagonize various clinically relevant AR mutants. Further molecular dynamics simulation provided atomic-level insights into the mode of action of HD57. Our research presents an efficient and rational strategy for discovering novel AR-GR dual antagonists, and the new scaffold provides important clues for the development of novel therapeutics for castration-resistant PCa.


Subject(s)
Androgen Antagonists , Prostatic Neoplasms , Male , Humans , Androgen Antagonists/pharmacology , Receptors, Glucocorticoid/metabolism , Receptors, Androgen/metabolism , Androgen Receptor Antagonists/pharmacology , Prostatic Neoplasms/metabolism , Cell Line, Tumor
7.
J Chem Inf Model ; 63(1): 111-125, 2023 01 09.
Article in English | MEDLINE | ID: mdl-36472475

ABSTRACT

Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on combinations of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis illustrate that the best prediction model, based on Attentive FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value, 67.5%, for the external validation set. The Shapley additive explanation (SHAP) and atom heatmap approaches were also utilized to discover the important features and structural fragments related to hematotoxicity, which could offer helpful tips for detecting undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to further characterize the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the novel graph-based deep learning algorithms and insightful interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.
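
Balanced accuracy (BA), the headline metric in this abstract, is the mean of sensitivity and specificity. The sketch below computes it for made-up binary labels (1 = hematotoxic); the function name and the example vectors are illustrative assumptions, not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Made-up labels with more negatives than positives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", plain_accuracy)
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

Because the dataset described above has roughly twice as many nonhematotoxic as hematotoxic compounds, a metric that weights both classes equally is a natural choice; plain accuracy would reward a model that leans toward the majority class.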


Subject(s)
Deep Learning , Computer Simulation , Machine Learning , Algorithms , Drug Discovery
8.
Acta Pharmacol Sin ; 43(11): 2817-2827, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35501362

ABSTRACT

Progressive ischemic stroke (PIS) is characterized by progressive neurological dysfunction after ischemia. Ischemia-evoked neuroinflammation is implicated in the progressive brain injury after cerebral ischemia, while Caspase-1, an active component of the inflammasome, exacerbates ischemic brain injury. Current Caspase-1 inhibitors are inadequate in safety and druggability. Here, we investigated the efficacy of CZL80, a novel Caspase-1 inhibitor, in mice with PIS. Mice and Caspase-1-/- mice were subjected to photothrombotic (PT)-induced cerebral ischemia. CZL80 (10, 30 mg·kg⁻¹·d⁻¹, i.p.) was administered for one week after PT onset. The transient and the progressive neurological dysfunction (measured as foot faults in the grid-walking task and forelimb symmetry in the cylinder task) were assessed on Day 1 and Days 4-7, respectively, after PT onset. Treatment with CZL80 (30 mg/kg) during Days 1-7 significantly reduced the progressive, but not the transient, neurological dysfunction. Furthermore, we showed that CZL80 administered on Days 4-7, when the progressive neurological dysfunction occurred, produced significant beneficial effects against PIS, suggesting an extended therapeutic time window. CZL80 administration could improve neurological function even as late as Day 43 after PT. In Caspase-1-/- mice with PIS, the beneficial effects of CZL80 were abolished. We found that Caspase-1 was upregulated during Days 4-7 after PT and predominantly located in activated microglia, which coincided with the progressive neurological deficits and was attenuated by CZL80. We showed that CZL80 administration did not reduce the infarct volume, but significantly suppressed microglial activation in the peri-infarct cortex, suggesting the involvement of the microglial inflammasome in the pathology of PIS. Taken together, this study demonstrates that Caspase-1 is required for the progressive neurological dysfunction in PIS. CZL80 is a promising drug to promote neurological recovery in PIS by inhibiting Caspase-1 within a long therapeutic time window.


Subject(s)
Brain Injuries , Brain Ischemia , Ischemic Stroke , Stroke , Mice , Animals , Inflammasomes , Disease Models, Animal , Brain Ischemia/drug therapy , Brain Ischemia/pathology , Microglia , Cerebral Infarction , Caspase 1 , Brain Injuries/pathology , Stroke/drug therapy , Stroke/pathology , Mice, Inbred C57BL
9.
J Cheminform ; 14(1): 23, 2022 Apr 15.
Article in English | MEDLINE | ID: mdl-35428354

ABSTRACT

Drug-drug interactions (DDIs) often cause serious adverse reactions and thus result in inestimable economic and social loss. Comprehensive DDI evaluation has become a major challenge in pharmaceutical research because experimental assessment is time-consuming and costly, and there is a pressing need for effective in silico methods that predict and evaluate DDIs accurately and efficiently. In this study, based on a large number of substrates and inhibitors related to five important CYP450 isozymes (CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4), a series of high-performance predictive models for metabolic DDIs were constructed using two machine learning methods (random forest and XGBoost) and four different types of descriptors (MOE_2D, CATS, ECFP4 and MACCS). To reduce the uncertainty of individual models, a consensus method was applied to yield more reliable predictions. A series of evaluations illustrated that the consensus models were more reliable and robust for DDI predictions of new drug combinations. For internal validation, the overall prediction accuracy and AUC value of the DDI models were around 0.8 and 0.9, respectively. When applied to the external datasets, the model accuracy was 0.793 and 0.795 for multi-level validation and external validation, respectively. Furthermore, we compared our model with some recently published tools and then applied the final model to FDA-approved drugs, proposing 54,013 possible drug pairs with potential DDIs. In summary, we developed a powerful DDI predictive model from the perspective of the CYP450 enzyme family, which should support future drug development and clinical pharmacy research.
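
The abstract does not say how the consensus was formed. One common choice, shown here purely as an assumption, is to average the probabilities predicted by the individual algorithm/descriptor models and threshold the mean. The model labels and probability values below are invented for illustration.

```python
def consensus_probability(model_probs):
    """Average the probabilities predicted by several models for one sample."""
    return sum(model_probs.values()) / len(model_probs)


def consensus_label(model_probs, threshold=0.5):
    """Return 1 if the averaged probability reaches the threshold, else 0."""
    return int(consensus_probability(model_probs) >= threshold)


# Hypothetical outputs of four algorithm/descriptor combinations for one
# compound and one CYP450 isozyme.
model_probs = {
    "RF+ECFP4": 0.62,
    "RF+MACCS": 0.48,
    "XGBoost+ECFP4": 0.71,
    "XGBoost+MACCS": 0.55,
}

print(consensus_probability(model_probs), consensus_label(model_probs))
```

Averaging dampens the effect of a single model that disagrees with the rest, as the 0.48 prediction does here.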

10.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35212357

ABSTRACT

Structural information for chemical compounds is usually presented as images in scientific documents, which cannot be easily understood and manipulated by computers. This makes optical chemical structure recognition (OCSR) an essential tool for automatically mining knowledge from an enormous amount of literature. However, existing OCSR methods fall far short of realistic requirements because of their poor recovery accuracy. In this paper, we developed a deep neural network model named ABC-Net (Atom and Bond Center Network) to predict graph structures directly. Based on the divide-and-conquer principle, we propose to model an atom or a bond as a single point at its center. In this way, we can leverage a fully convolutional neural network (CNN) to generate a series of heat-maps to identify these points and predict relevant properties, such as atom types, atom charges, bond types and other properties. The molecular structure can then be recovered by assembling the detected atoms and bonds. Our approach integrates all the detection and property prediction tasks into a single fully convolutional network, which is scalable and capable of processing molecular images quite efficiently. Experimental results demonstrate that our method achieves a significant improvement in recognition performance compared with publicly available tools. The proposed method could be considered a promising solution to OCSR problems and a starting point for the acquisition of molecular information from the literature.
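
Treating each atom or bond as a single center point means that, after the network produces a heat-map, detection reduces to finding local maxima. The sketch below shows that post-processing step on a hand-written grid. The 3x3 neighbourhood, the 0.5 threshold, and the function name are assumptions for illustration, not details taken from ABC-Net.

```python
def find_peaks(heatmap, threshold=0.5):
    """Return (row, col) of cells that are at or above the threshold and
    strictly greater than every cell in their 3x3 neighbourhood."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Hand-written toy heat-map with two bright spots standing in for two atoms.
heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.3, 0.1],
    [0.0, 0.0, 0.3, 0.8, 0.2],
    [0.0, 0.0, 0.1, 0.2, 0.1],
]

print(find_peaks(heatmap))
```

In a full pipeline, the property maps (atom type, charge, bond type) would be read out at the same coordinates as the detected peaks.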


Subject(s)
Deep Learning , Molecular Structure , Neural Networks, Computer
11.
Acta Pharmacol Sin ; 43(9): 2429-2438, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35110698

ABSTRACT

Synthetic glucocorticoids (GCs) have been widely used in the treatment of a broad range of inflammatory diseases, but their clinical use is limited by undesired side effects such as metabolic disorders, osteoporosis, skin and muscle atrophies, mood disorders and hypothalamic-pituitary-adrenal (HPA) axis suppression. Selective glucocorticoid receptor modulators (SGRMs) are expected to have promising anti-inflammatory efficacy with fewer of the side effects caused by GCs. Here, we report HT-15, a prospective SGRM discovered by structure-based virtual screening (VS) and bioassays. HT-15 can selectively act on the NF-κB/AP1-mediated transrepression function of the glucocorticoid receptor (GR) and repress the expression of pro-inflammatory cytokines (i.e., IL-1β, IL-6, COX-2, and CCL-2) as effectively as dexamethasone (Dex). Compared with Dex, HT-15 shows less transactivation potency, which is associated with the main adverse effects of synthetic GCs, and no cross-activity with other nuclear receptors. Furthermore, HT-15 exhibits very weak inhibition of the OPG/RANKL ratio. Therefore, it may reduce the side effects induced by conventional GCs. The bioactive compound HT-15 can serve as a starting point for the development of novel therapeutics for high-dose or long-term anti-inflammatory treatment.


Subject(s)
Glucocorticoids , Receptors, Glucocorticoid , Anti-Inflammatory Agents/pharmacology , Biological Assay , Glucocorticoids/pharmacology , Prospective Studies , Receptors, Glucocorticoid/metabolism
12.
Acta Pharmacol Sin ; 43(6): 1508-1520, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34429524

ABSTRACT

Macrophage migration inhibitory factor (MIF) is a pluripotent pro-inflammatory cytokine and is related to acute and chronic inflammatory responses, immune disorders, tumors, and other diseases. In this study, an integrated virtual screening strategy and bioassays were used to search for potent MIF inhibitors. Twelve compounds with better bioactivity than the prototypical MIF-inhibitor ISO-1 (IC50 = 14.41 µM) were identified by an in vitro enzymatic activity assay. Structural analysis revealed that these inhibitors have novel structural scaffolds. Compound 11 was then chosen for further characterization in vitro, and it exhibited marked anti-inflammatory efficacy in LPS-activated BV-2 microglial cells by suppressing the activation of nuclear factor kappa B (NF-κB) and mitogen-activated protein kinases (MAPKs). Our findings suggest that MIF may be involved in the regulation of microglial inflammatory activation and that small-molecule MIF inhibitors may serve as promising therapeutic agents for neuroinflammatory diseases.


Subject(s)
Macrophage Migration-Inhibitory Factors , Anti-Inflammatory Agents/chemistry , Biological Assay , Macrophage Migration-Inhibitory Factors/metabolism , Microglia/metabolism , NF-kappa B/metabolism
13.
Acta Pharmacol Sin ; 43(6): 1605-1615, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34667293

ABSTRACT

Decaprenylphosphoryl-β-D-ribose oxidase (DprE1) plays important roles in the biosynthesis of the mycobacterial cell wall. DprE1 inhibitors have shown great potential in the development of new regimens for tuberculosis (TB) treatment. In this study, an integrated molecular modeling strategy, which combined computational bioactivity fingerprints and structure-based virtual screening, was employed to identify potential DprE1 inhibitors. Two lead compounds (B2 and H3) that could inhibit DprE1 and thus kill Mycobacterium smegmatis in vitro were identified. Moreover, compound H3 showed potent inhibitory activity against Mycobacterium tuberculosis in vitro (MIC = 1.25 µM) and low cytotoxicity against mouse embryo fibroblast NIH-3T3 cells. Our research provides an effective strategy to discover novel anti-TB lead compounds.


Subject(s)
Antitubercular Agents , Mycobacterium tuberculosis , Animals , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Bacterial Proteins , Mice , Models, Molecular
14.
Acta Pharmacol Sin ; 43(1): 229-239, 2022 Jan.
Article in English | MEDLINE | ID: mdl-33767381

ABSTRACT

Androgen receptor (AR), a ligand-activated transcription factor, is a master regulator in the development and progression of prostate cancer (PCa). A major challenge for the clinically used AR antagonists is the rapid emergence of resistance induced by mutations in the AR ligand binding domain (LBD), and novel anti-AR therapeutics that can combat mutation-induced resistance are therefore in high demand. In this context, blocking the interaction between AR and DNA represents an innovative strategy. However, the hits confirmed to target this interaction so far are all structurally based on a single chemical scaffold. In this study, an integrated docking-based virtual screening (VS) strategy based on the crystal structure of the DNA binding domain (DBD) of AR was conducted to search for novel AR antagonists with new scaffolds, and 2-(2-butyl-1,3-dioxoisoindoline-5-carboxamido)-4,5-dimethoxybenzoic acid (Cpd39) was identified as a potential hit, which was able to block the binding of AR DBD to DNA and showed decent potency against AR transcriptional activity. Furthermore, Cpd39 was safe and capable of effectively inhibiting the proliferation of PCa cell lines (i.e., LNCaP, PC3, DU145, and 22RV1) and reducing the expression of the genes regulated by not only the full-length AR but also the splice variant AR-V7. The novel AR DBD-ARE blocker Cpd39 could serve as a starting point for the development of new therapeutics for castration-resistant PCa.


Subject(s)
Androgen Receptor Antagonists/pharmacology , DNA/antagonists & inhibitors , Drug Discovery , Molecular Docking Simulation , Receptors, Androgen/metabolism , Androgen Receptor Antagonists/chemistry , Binding Sites/drug effects , DNA/chemistry , Dose-Response Relationship, Drug , Drug Evaluation, Preclinical , Humans , Molecular Structure , Receptors, Androgen/chemistry , Structure-Activity Relationship
15.
J Cheminform ; 13(1): 86, 2021 Nov 13.
Article in English | MEDLINE | ID: mdl-34774096

ABSTRACT

In the process of drug discovery, the optimization of lead compounds has always been a challenge faced by pharmaceutical chemists. Matched molecular pair analysis (MMPA), a promising tool to efficiently extract and summarize the relationship between structural transformation and property change, is suitable for local structural optimization tasks. In particular, the integration of MMPA with QSAR modeling can further strengthen the utility of MMPA in guiding molecular optimization. In this study, a new semi-automated procedure based on KNIME was developed to support MMPA on both large- and small-scale datasets, including molecular preparation, QSAR model construction, applicability domain evaluation, and MMP calculation and application. Two examples covering regression and classification tasks are provided to illustrate the importance of MMPA and to demonstrate the reliability and utility of this MMPA-by-QSAR pipeline.
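
The core bookkeeping of MMPA, relating a structural transformation to a property change, can be sketched in a few lines. The snippet assumes the matched pairs have already been found (the fragmentation step is not shown), and the transformation labels and property values are made up; it is not the KNIME workflow described above.

```python
from collections import defaultdict
from statistics import mean


def summarize_transformations(pairs):
    """Group matched pairs by transformation and report, for each one,
    the number of pairs and the mean property change (B minus A)."""
    deltas = defaultdict(list)
    for transformation, prop_a, prop_b in pairs:
        deltas[transformation].append(prop_b - prop_a)
    return {t: (len(d), mean(d)) for t, d in deltas.items()}


# Made-up matched pairs: (transformation, property of A, property of B).
pairs = [
    ("H>>F", 2.0, 2.25),
    ("H>>F", 1.5, 1.75),
    ("H>>OH", 2.0, 1.0),
    ("H>>OH", 3.0, 2.5),
]

for transformation, (count, avg_delta) in summarize_transformations(pairs).items():
    print(transformation, count, avg_delta)
```

In an MMPA-by-QSAR setting, the property values could come from a QSAR model's predictions rather than from measurements, which is how small experimental datasets can still yield transformation statistics.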

16.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34427296

ABSTRACT

Computational methods have become indispensable tools to accelerate the drug discovery process and alleviate the excessive dependence on time-consuming and labor-intensive experiments. Traditional feature-engineering approaches rely heavily on expert knowledge to devise useful features, which can be costly and sometimes biased. The emerging deep learning (DL) methods offer a data-driven way to automatically learn expressive representations from complex raw data. Inspired by this, researchers have attempted to apply various deep neural network models to simplified molecular input line entry specification (SMILES) strings, which contain all the composition and structure information of molecules. However, current models usually suffer from the scarcity of labeled data. This results in a low generalization ability of SMILES-based DL models, which prevents them from competing with state-of-the-art computational methods. In this study, we utilized the BiLSTM (bidirectional long short-term memory) attention network (BAN), in which we employed a novel multi-step attention mechanism to facilitate the extraction of key features from SMILES strings. Meanwhile, SMILES enumeration was utilized as a data augmentation method in the training phase to substantially increase the amount of labeled data and the probability of mining more patterns from complex SMILES. We again took advantage of SMILES enumeration in the prediction phase to rectify model prediction bias and provide a more accurate prediction. Combined with the BAN model, our strategies can greatly improve the performance of latent features learned from SMILES strings. In 11 canonical absorption, distribution, metabolism, excretion and toxicity-related tasks, our method outperformed the state-of-the-art approaches.
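
Using SMILES enumeration at prediction time amounts to scoring several equivalent strings for the same molecule and combining the results. The sketch below averages them; the averaging rule, the function names, and the deliberately order-sensitive stand-in predictor are assumptions for illustration, and the alternative SMILES strings are written by hand rather than generated by a toolkit.

```python
from statistics import mean


def predict_with_enumeration(smiles_variants, predict):
    """Average a model's predictions over several SMILES of one molecule."""
    return mean([predict(s) for s in smiles_variants])


def toy_predict(smiles):
    """Stand-in for a trained model; it deliberately depends on the order
    in which atoms are written, which is the bias averaging smooths out."""
    return 1.0 if smiles.startswith("C") else 0.0


# Three valid SMILES strings that all encode ethanol.
ethanol_variants = ["CCO", "OCC", "C(O)C"]

print([toy_predict(s) for s in ethanol_variants])
print(predict_with_enumeration(ethanol_variants, toy_predict))
```

A single string gives either 1.0 or 0.0 depending on how the molecule happens to be written, while the averaged prediction no longer depends on that arbitrary choice.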


Subject(s)
Cheminformatics/methods , Deep Learning , Drug Discovery/methods , Software , Algorithms , Drug Development , Research Design
17.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33951729

ABSTRACT

MOTIVATION: Accurate and efficient prediction of molecular properties is one of the fundamental issues in drug design and discovery pipelines. Traditional feature engineering-based approaches require extensive expertise in the feature design and selection process. With the development of artificial intelligence (AI) technologies, data-driven methods exhibit unparalleled advantages over the feature engineering-based methods in various domains. Nevertheless, when applied to molecular property prediction, AI models usually suffer from the scarcity of labeled data and show poor generalization ability. RESULTS: In this study, we proposed molecular graph BERT (MG-BERT), which integrates the local message passing mechanism of graph neural networks (GNNs) into the powerful BERT model to facilitate learning from molecular graphs. Furthermore, an effective self-supervised learning strategy named masked atoms prediction was proposed to pretrain the MG-BERT model on a large amount of unlabeled data to mine context information in molecules. We found the MG-BERT model can generate context-sensitive atomic representations after pretraining and transfer the learned knowledge to the prediction of a variety of molecular properties. The experimental results show that the pretrained MG-BERT model with a little extra fine-tuning can consistently outperform the state-of-the-art methods on all 11 ADMET datasets. Moreover, the MG-BERT model leverages attention mechanisms to focus on atomic features essential to the target property, providing excellent interpretability for the trained model. The MG-BERT model does not require any hand-crafted feature as input and is more reliable due to its excellent interpretability, providing a novel framework to develop state-of-the-art models for a wide range of drug discovery tasks.
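
The masked atoms prediction task can be pictured as hiding a few atoms of a molecule and asking the model to recover them. The sketch below shows only the data preparation side. The 15% masking fraction follows the convention of the original BERT rather than anything stated in this abstract, and the function name, mask token, and atom list are hypothetical.

```python
import random


def mask_atoms(atoms, mask_fraction=0.15, mask_token="[MASK]", seed=0):
    """Replace a random subset of atom symbols with a mask token.

    Returns the masked input list and a dict {position: original symbol}
    that serves as the prediction target.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(atoms) * mask_fraction))
    masked_positions = sorted(rng.sample(range(len(atoms)), n_mask))
    inputs = list(atoms)
    labels = {}
    for pos in masked_positions:
        labels[pos] = inputs[pos]
        inputs[pos] = mask_token
    return inputs, labels


# Atom symbols of a hypothetical eight-atom molecule, in arbitrary order.
atoms = ["C", "C", "O", "N", "C", "C", "C", "O"]
inputs, labels = mask_atoms(atoms)

print(inputs)
print(labels)
```

Because the labels come from the molecule itself, no experimental annotation is needed, which is what allows pretraining on large unlabeled collections.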


Subject(s)
Models, Theoretical , Neural Networks, Computer
18.
J Med Chem ; 64(11): 7544-7554, 2021 06 10.
Article in English | MEDLINE | ID: mdl-34008979

ABSTRACT

As one of the central tasks of modern medicinal chemistry, scaffold hopping is expected to lead to the discovery of structurally novel, biologically active compounds and to broaden the chemical space of known active compounds. Here, we report the computational bioactivity fingerprint (CBFP) for easier scaffold hopping, in which the activities predicted by multiple quantitative structure-activity relationship models are integrated to characterize the biological space of a molecule. In retrospective benchmarks, the CBFP representation shows outstanding scaffold hopping potential relative to other chemical descriptors. In the prospective validation for the discovery of novel inhibitors of poly [ADP-ribose] polymerase 1, 35 predicted compounds with diverse structures were tested, 25 of which showed detectable growth-inhibitory activity; beyond this, the most potent (compound 6) has an IC50 of 0.263 nM. These results support the use of the CBFP representation as the bioactivity proxy of molecules to explore uncharted chemical space and discover novel compounds.
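
The idea of describing a molecule by its predicted activities rather than by its structure can be sketched as follows. The two "QSAR models" here are trivial hand-written functions standing in for trained models, and the feature names, molecules, and the use of cosine similarity are all assumptions for illustration.

```python
import math


def bioactivity_fingerprint(molecule_features, qsar_models):
    """Stack the predictions of several QSAR models into one vector."""
    return [model(molecule_features) for model in qsar_models]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical stand-ins for trained QSAR models: each maps a feature
# dictionary to one predicted activity.
qsar_models = [
    lambda f: f["n_rings"] / 4,
    lambda f: f["n_hbd"] / 2,
]

mol_a = {"n_rings": 2, "n_hbd": 1}
mol_c = {"n_rings": 4, "n_hbd": 0}

fp_a = bioactivity_fingerprint(mol_a, qsar_models)
fp_c = bioactivity_fingerprint(mol_c, qsar_models)
print(fp_a, fp_c, cosine_similarity(fp_a, fp_c))
```

Two molecules with different scaffolds can end up close in this predicted-activity space, which is why such a representation may suit scaffold hopping better than purely structural descriptors.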


Subject(s)
Drug Discovery/methods , Quantitative Structure-Activity Relationship , Cell Line , Cell Proliferation/drug effects , Humans , Poly(ADP-ribose) Polymerase Inhibitors/chemistry , Poly(ADP-ribose) Polymerase Inhibitors/metabolism , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , Poly(ADP-ribose) Polymerases/chemistry , Poly(ADP-ribose) Polymerases/metabolism
19.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33709154

ABSTRACT

BACKGROUND: Substructure screening is widely applied to evaluate the molecular potency and ADMET properties of compounds in drug discovery pipelines, and it can also be used to interpret QSAR models for the design of new compounds with desirable physicochemical and biological properties. With the continuous accumulation of more experimental data, data-driven computational systems which can derive representative substructures from large chemical libraries attract more attention. Therefore, the development of an integrated and convenient tool to generate and implement representative substructures is urgently needed. RESULTS: In this study, PySmash, a user-friendly and powerful tool to generate different types of representative substructures, was developed. The current version of PySmash provides both a Python package and an individual executable program, which achieves ease of operation and pipeline integration. Three types of substructure generation algorithms, including circular, path-based and functional group-based algorithms, are provided. Users can conveniently customize their own requirements for substructure size, accuracy and coverage, statistical significance and parallel computation during execution. Besides, PySmash provides the function for external data screening. CONCLUSION: PySmash, a user-friendly and integrated tool for the automatic generation and implementation of representative substructures, is presented. Three screening examples, including toxicophore derivation, privileged motif detection and the integration of substructures with machine learning (ML) models, are provided to illustrate the utility of PySmash in safety profile evaluation, therapeutic activity exploration and molecular optimization, respectively. Its executable program and Python package are available at https://github.com/kotori-y/pySmash.


Subject(s)
Computational Biology/methods , Drug Discovery/methods , Machine Learning , Software , Carcinogenicity Tests/methods , Carcinogens , Drug Screening Assays, Antitumor/methods , Humans
20.
Drug Discov Today ; 26(6): 1353-1358, 2021 06.
Article in English | MEDLINE | ID: mdl-33581116

ABSTRACT

In 2010, the pan-assay interference compounds (PAINS) rule was proposed to identify false-positive compounds, especially frequent hitters (FHs), in biological screening campaigns, and has rapidly become an essential component in drug design. However, the specific mechanisms remain unknown, and the result validation and follow-up processing schemes are still unclear. In this review, a large benchmark collection of >600,000 compounds sourced from databases and the literature, including six common false-positive mechanisms, was used to evaluate the detection ability of PAINS. In addition, 400 million purchasable molecules from the ZINC database were also applied to PAINS screening. The results indicate that the PAINS rule is not suitable for the screening of all types of false-positive results and needs more improvement.


Subject(s)
Databases, Factual , Drug Design , High-Throughput Screening Assays/methods , Benchmarking , Drug Discovery/methods , Humans