1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38385872

ABSTRACT

Drug discovery and development constitute a laborious and costly undertaking. The success of a drug hinges not only on good efficacy but also on acceptable absorption, distribution, metabolism, elimination, and toxicity (ADMET) properties. Overall, up to 50% of drug development failures have been attributed to undesirable ADMET profiles. As a multi-parameter objective, the optimization of ADMET properties is extremely challenging owing to the vast chemical space and limited human expert knowledge. In this study, a freely available platform called Chemical Molecular Optimization, Representation and Translation (ChemMORT) is developed for the optimization of multiple ADMET endpoints without the loss of potency (https://cadd.nscc-tj.cn/deploy/chemmort/). ChemMORT contains three modules: Simplified Molecular Input Line Entry System (SMILES) Encoder, Descriptor Decoder and Molecular Optimizer. The SMILES Encoder generates a molecular representation as a 512-dimensional vector, and the Descriptor Decoder translates that representation back to the corresponding molecular structure with high accuracy. Based on this reversible molecular representation and a particle swarm optimization strategy, the Molecular Optimizer can effectively optimize undesirable ADMET properties without the loss of bioactivity, which essentially accomplishes inverse QSAR design. The constrained multi-objective optimization of a poly (ADP-ribose) polymerase-1 inhibitor is provided as a case study to explore the utility of ChemMORT.


Subjects
Deep Learning; Humans; Drug Development; Drug Discovery; Poly(ADP-ribose) Polymerase Inhibitors
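
The abstract above describes optimizing molecules by running particle swarm optimization over a continuous latent representation. As a rough illustration of that search strategy only, the following minimal sketch runs a basic particle swarm on a toy objective. It is not ChemMORT's implementation: the 4-dimensional search space, the `toy_objective` function, and all parameter values are assumptions chosen for illustration, whereas the platform itself is described as using a 512-dimensional vector scored by ADMET models.

```python
import random


def particle_swarm_minimize(objective, dim, n_particles=30, n_iters=200,
                            inertia=0.7, c_personal=1.5, c_global=1.5,
                            bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` over a dim-dimensional box with a basic PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_objective(z):
    """Hypothetical stand-in for a property score computed from a latent vector."""
    target = [1.0, -2.0, 0.5, 3.0]
    return sum((a - b) ** 2 for a, b in zip(z, target))


best_z, best_score = particle_swarm_minimize(toy_objective, dim=4)
print(best_z, best_score)
```

In a real setting the objective would decode each candidate vector to a structure and combine predicted ADMET endpoints with a potency constraint; here it is a simple squared distance so that the behavior of the swarm is easy to check.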
2.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36642412

ABSTRACT

Machine learning-based scoring functions (MLSFs) have become a very favorable alternative to classical scoring functions because of their potential superior screening performance. However, the information of negative data used to construct MLSFs was rarely reported in the literature, and meanwhile the putative inactive molecules recorded in existing databases usually have obvious bias from active molecules. Here we proposed an easy-to-use method named AMLSF that combines active learning using negative molecular selection strategies with MLSF, which can iteratively improve the quality of inactive sets and thus reduce the false positive rate of virtual screening. We chose energy auxiliary terms learning as the MLSF and validated our method on eight targets in the diverse subset of DUD-E. For each target, we screened the IterBioScreen database by AMLSF and compared the screening results with those of the four control models. The results illustrate that the number of active molecules in the top 1000 molecules identified by AMLSF was significantly higher than those identified by the control models. In addition, the free energy calculation results for the top 10 molecules screened out by the AMLSF, null model and control models based on DUD-E also proved that more active molecules can be identified, and the false positive rate can be reduced by AMLSF.


Subjects
Proteins; Proteins/metabolism; Databases, Factual; Ligands; Molecular Docking Simulation; Protein Binding
3.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32496540

ABSTRACT

Scoring functions (SFs) based on complex machine learning (ML) algorithms have gradually emerged as a promising alternative to overcome the weaknesses of classical SFs. However, extensive efforts have been devoted to the development of SFs based on new protein-ligand interaction representations and advanced alternative ML algorithms instead of the energy components obtained by the decomposition of existing SFs. Here, we propose a new method named energy auxiliary terms learning (EATL), in which the scoring components are extracted and used as the input for the development of three levels of ML SFs including EATL SFs, docking-EATL SFs and comprehensive SFs with ascending VS performance. The EATL approach not only outperforms classical SFs for the absolute performance (ROC) and initial enrichment (BEDROC) but also yields comparable performance compared with other advanced ML-based methods on the diverse subset of Directory of Useful Decoys: Enhanced (DUD-E). The test on the relatively unbiased actives as decoys (AD) dataset also proved the effectiveness of EATL. Furthermore, the idea of learning from SF components to yield improved screening power can also be extended to other docking programs and SFs available.


Subjects
Drug Discovery; Machine Learning; Molecular Docking Simulation; Proteins/chemistry; Protein Binding
4.
Brief Bioinform ; 22(1): 474-484, 2021 01 18.
Article in English | MEDLINE | ID: mdl-31885044

ABSTRACT

BACKGROUND: With the increasing development of biotechnology and information technology, publicly available data in chemistry and biology are undergoing explosive growth. Such wealthy information in these resources needs to be extracted and then transformed to useful knowledge by various data mining methods. However, a main computational challenge is how to effectively represent or encode molecular objects under investigation such as chemicals, proteins, DNAs and even complicated interactions when data mining methods are employed. To further explore these complicated data, an integrated toolkit to represent different types of molecular objects and support various data mining algorithms is urgently needed. RESULTS: We developed a freely available R/CRAN package, called BioMedR, for molecular representations of chemicals, proteins, DNAs and pairwise samples of their interactions. The current version of BioMedR could calculate 293 molecular descriptors and 13 kinds of molecular fingerprints for small molecules, 9920 protein descriptors based on protein sequences and six types of generalized scale-based descriptors for proteochemometric modeling, more than 6000 DNA descriptors from nucleotide sequences and six types of interaction descriptors using three different combining strategies. Moreover, this package realized five similarity calculation methods and four powerful clustering algorithms as well as several useful auxiliary tools, which aims at building an integrated analysis pipeline for data acquisition, data checking, descriptor calculation and data modeling. CONCLUSION: BioMedR provides a comprehensive and uniform R package to link up different representations of molecular objects with each other and will benefit cheminformatics/bioinformatics and other biomedical users. It is available at: https://CRAN.R-project.org/package=BioMedR and https://github.com/wind22zhu/BioMedR/.


Subjects
Computational Biology/methods; Database Management Systems; Data Management/methods; Databases, Chemical; Databases, Genetic; Humans
5.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32892221

ABSTRACT

BACKGROUND: High-throughput screening (HTS) and virtual screening (VS) have been widely used to identify potential hits from large chemical libraries. However, the frequent occurrence of 'noisy compounds' in the screened libraries, such as compounds with poor drug-likeness, poor selectivity or potential toxicity, has greatly weakened the enrichment capability of HTS and VS campaigns. Therefore, the development of comprehensive and credible tools to detect noisy compounds from chemical libraries is urgently needed in early stages of drug discovery. RESULTS: In this study, we developed a freely available integrated python library for negative design, called Scopy, which supports the functions of data preparation, calculation of descriptors, scaffolds and screening filters, and data visualization. The current version of Scopy can calculate 39 basic molecular properties, 3 comprehensive molecular evaluation scores, 2 types of molecular scaffolds, 6 types of substructure descriptors and 2 types of fingerprints. A number of important screening rules are also provided by Scopy, including 15 drug-likeness rules (13 drug-likeness rules and 2 building block rules), 8 frequent hitter rules (four assay interference substructure filters and four promiscuous compound substructure filters), and 11 toxicophore filters (five human-related toxicity substructure filters, three environment-related toxicity substructure filters and three comprehensive toxicity substructure filters). Moreover, this library supports four different visualization functions to help users to gain a better understanding of the screened data, including basic feature radar chart, feature-feature-related scatter diagram, functional group marker gram and cloud gram. CONCLUSION: Scopy provides a comprehensive Python package to filter out compounds with undesirable properties or substructures, which will benefit the design of high-quality chemical libraries for drug design and discovery. 
It is freely available at https://github.com/kotori-y/Scopy.


Subjects
Databases, Pharmaceutical/statistics & numerical data; Drug Design; Drug Development/methods; High-Throughput Screening Assays/methods; Small Molecule Libraries; Biological Products/chemistry; Computational Biology/methods; Drug Discovery/methods; Drug Stability; Humans; Molecular Structure; Pharmaceutical Preparations/chemistry; Reproducibility of Results; Research Design
6.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34427296

ABSTRACT

Computational methods have become indispensable tools to accelerate the drug discovery process and alleviate the excessive dependence on time-consuming and labor-intensive experiments. Traditional feature-engineering approaches rely heavily on expert knowledge to devise useful features, which can be costly and sometimes biased. The emerging deep learning (DL) methods deliver a data-driven way to automatically learn expressive representations from complex raw data. Inspired by this, researchers have attempted to apply various deep neural network models to simplified molecular input line entry specification (SMILES) strings, which contain all the composition and structure information of molecules. However, current models usually suffer from the scarcity of labeled data. This results in a low generalization ability of SMILES-based DL models, which prevents them from competing with the state-of-the-art computational methods. In this study, we utilized the BiLSTM (bidirectional long short-term memory) attention network (BAN), in which we employed a novel multi-step attention mechanism to facilitate the extraction of key features from the SMILES strings. Meanwhile, SMILES enumeration was utilized as a data augmentation method in the training phase to substantially increase the amount of labeled data and raise the probability of mining more patterns from complex SMILES. We again took advantage of SMILES enumeration in the prediction phase to rectify model prediction bias and provide a more accurate prediction. Combined with the BAN model, our strategies can greatly improve the performance of latent features learned from SMILES strings. In 11 canonical absorption, distribution, metabolism, excretion and toxicity-related tasks, our method outperformed the state-of-the-art approaches.


Subjects
Cheminformatics/methods; Deep Learning; Drug Discovery/methods; Software; Algorithms; Drug Development; Research Design
7.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33418563

ABSTRACT

Matched molecular pairs analysis (MMPA) has become a powerful tool for automatically and systematically identifying medicinal chemistry transformations from compound/property datasets. However, accurate determination of matched molecular pair (MMP) transformations largely depends on the size and quality of existing experimental data. Lack of high-quality experimental data heavily hampers the extraction of more effective medicinal chemistry knowledge. Here, we developed a new strategy called quantitative structure-activity relationship (QSAR)-assisted-MMPA to expand the number of chemical transformations and took the logD7.4 property endpoint as an example to demonstrate the reliability of the new method. A reliable logD7.4 consensus prediction model was first established, and its applicability domain was strictly assessed. By applying the reliable logD7.4 prediction model to screen two chemical databases, we obtained more high-quality logD7.4 data by defining a strict applicability domain threshold. Then, MMPA was performed on the predicted data and experimental data to derive more chemical rules. To validate the reliability of the chemical rules, we compared the magnitude and directionality of the property changes of the predicted rules with those of the measured rules. Then, we compared the novel chemical rules generated by our proposed approach with the published chemical rules, and found that the magnitude and directionality of the property changes were consistent, indicating that the proposed QSAR-assisted-MMPA approach has the potential to enrich the collection of rule types or even identify completely novel rules. Finally, we found that the number of MMP rules derived from the experimental data could be amplified by the predicted data, which helps in analyzing medicinal chemistry rules in a local chemical environment. In summary, the proposed QSAR-assisted-MMPA approach could be regarded as a very promising strategy to expand the chemical transformation space for lead optimization, especially when there are not enough experimental data to support MMPA.


Subjects
Chemistry Techniques, Synthetic/methods; Chemistry, Pharmaceutical/methods; Drug Discovery/methods; Drugs, Investigational/chemical synthesis; Models, Statistical; Biotransformation; Databases, Chemical; Datasets as Topic; Drug Discovery/statistics & numerical data; Drugs, Investigational/metabolism; Humans; Molecular Structure; Quantitative Structure-Activity Relationship; Reproducibility of Results
8.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33709154

ABSTRACT

BACKGROUND: Substructure screening is widely applied to evaluate the molecular potency and ADMET properties of compounds in drug discovery pipelines, and it can also be used to interpret QSAR models for the design of new compounds with desirable physicochemical and biological properties. With the continuous accumulation of more experimental data, data-driven computational systems which can derive representative substructures from large chemical libraries attract more attention. Therefore, the development of an integrated and convenient tool to generate and implement representative substructures is urgently needed. RESULTS: In this study, PySmash, a user-friendly and powerful tool to generate different types of representative substructures, was developed. The current version of PySmash provides both a Python package and an individual executable program, which achieves ease of operation and pipeline integration. Three types of substructure generation algorithms, including circular, path-based and functional group-based algorithms, are provided. Users can conveniently customize their own requirements for substructure size, accuracy and coverage, statistical significance and parallel computation during execution. Besides, PySmash provides the function for external data screening. CONCLUSION: PySmash, a user-friendly and integrated tool for the automatic generation and implementation of representative substructures, is presented. Three screening examples, including toxicophore derivation, privileged motif detection and the integration of substructures with machine learning (ML) models, are provided to illustrate the utility of PySmash in safety profile evaluation, therapeutic activity exploration and molecular optimization, respectively. Its executable program and Python package are available at https://github.com/kotori-y/pySmash.


Subjects
Computational Biology/methods; Drug Discovery/methods; Machine Learning; Software; Carcinogenicity Tests/methods; Carcinogens; Drug Screening Assays, Antitumor/methods; Humans
9.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33201188

ABSTRACT

BACKGROUND: Fluorescent detection methods are indispensable tools for chemical biology. However, the frequent appearance of potentially fluorescent compounds has greatly interfered with the recognition of compounds with genuine activity. Such fluorescence interference is especially difficult to identify as it is reproducible and concentration-dependent. Therefore, the development of a credible screening tool to detect fluorescent compounds from chemical libraries is urgently needed in early stages of drug discovery. RESULTS: In this study, we developed a webserver, ChemFLuo, for fluorescent compound detection, based on two large and high-quality training datasets containing 4906 blue and 8632 green fluorescent compounds. These molecules were used to construct a group of prediction models based on the combination of three machine learning algorithms and seven types of molecular representations. The best blue fluorescence prediction model achieved a balanced accuracy (BA) of 0.858 and an area under the receiver operating characteristic curve (AUC) of 0.931 for the validation set, and BA = 0.823 and AUC = 0.903 for the test set. The best green fluorescence prediction model achieved BA = 0.810 and AUC = 0.887 for the validation set, and BA = 0.771 and AUC = 0.852 for the test set. Besides the prediction models, 22 blue and 16 green representative fluorescent substructures were summarized for the screening of potential fluorescent compounds. The comparison with other fluorescence detection tools and the application to external validation sets and large molecule libraries have demonstrated the reliability of the prediction models for fluorescent compound detection. CONCLUSION: ChemFLuo is a public webserver to filter out compounds with undesirable fluorescent properties, which will benefit the design of high-quality chemical libraries for drug discovery. It is freely available at http://admet.scbdd.com/chemfluo/index/.


Subjects
Drug Discovery; Fluorescent Dyes/chemistry; Machine Learning; Models, Chemical; Small Molecule Libraries; Fluorescence
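
The abstract above reports its classifiers by balanced accuracy (BA) and AUC. For readers unfamiliar with the two metrics, the sketch below computes both from scratch on a small set of made-up labels and scores; none of the numbers come from the study, and the 0.5 decision threshold is an assumption. BA is the mean of sensitivity and specificity, so it is less affected by class imbalance than plain accuracy, while AUC measures how well the scores rank positives above negatives regardless of threshold.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (TPR) and specificity (TNR) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Count a win as 1, a tie as 0.5, a loss as 0 over all positive/negative pairs.
    wins = sum((1.0 if p > n else 0.5 if p == n else 0.0) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 3 "fluorescent" compounds (1) and 5 non-fluorescent ones (0).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1, 0.05]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

ba = balanced_accuracy(labels, preds)
auc = roc_auc(labels, scores)
print(round(ba, 3), round(auc, 3))
```

The pairwise AUC shown here is quadratic in the number of compounds; for datasets of the size described in the abstract, a rank-based implementation from an established library would be the practical choice.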
10.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33940596

ABSTRACT

Poly (ADP-ribose) polymerase-1 (PARP1) has been regarded as a vital target in recent years, and PARP1 inhibitors can be used for ovarian and breast cancer therapies. However, most PARP1 inhibitors suffer from low solubility and permeability. Discovering more molecules with novel frameworks would therefore offer greater opportunities to apply PARP1 inhibitors in broader clinical fields. In the present study, multiple virtual screening (VS) methods were employed to evaluate the screening efficiency of ligand-based, structure-based and data fusion methods on the PARP1 target. The VS methods include 2D similarity screening, structure-activity relationship (SAR) models, docking and complex-based pharmacophore screening. Moreover, the sum rank, sum score and reciprocal rank were adopted as data fusion methods. The evaluation results show that similarity searching based on the Torsion fingerprint, six SAR models, Glide docking and pharmacophore screening using Phase have excellent screening performance. The best data fusion method is the reciprocal rank, but the sum score also performs well in framework enrichment. In general, the ligand-based VS methods show better performance on PARP1 inhibitor screening. These findings confirm that adding ligand-based methods to the early screening stage will greatly improve the screening efficiency and help enrich more highly active PARP1 inhibitors with diverse structures.


Subjects
Databases, Chemical; Molecular Docking Simulation; Poly (ADP-Ribose) Polymerase-1/antagonists & inhibitors; Poly(ADP-ribose) Polymerase Inhibitors/chemistry; Drug Evaluation, Preclinical; Humans; Poly (ADP-Ribose) Polymerase-1/chemistry; Structure-Activity Relationship
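
The abstract above names three data fusion schemes (sum rank, sum score and reciprocal rank) without defining them. The sketch below shows one common way such schemes are formulated; the exact definitions used in the study may differ, and the compound names, scores, and min-max normalization are assumptions made for illustration.

```python
def ranks_from_scores(scores):
    """Map compound -> rank (1 = best), assuming a higher score is better."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {cid: i + 1 for i, cid in enumerate(ordered)}


def min_max(scores):
    """Rescale one method's scores to [0, 1] so different methods are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    return {cid: (s - lo) / (hi - lo) for cid, s in scores.items()}


def fuse(score_tables):
    """Return three fused orderings (best first) over the same compound set."""
    rank_tables = [ranks_from_scores(t) for t in score_tables]
    norm_tables = [min_max(t) for t in score_tables]
    compounds = list(score_tables[0])
    sum_rank = {c: sum(r[c] for r in rank_tables) for c in compounds}
    sum_score = {c: sum(n[c] for n in norm_tables) for c in compounds}
    recip_rank = {c: sum(1.0 / r[c] for r in rank_tables) for c in compounds}
    return {
        "sum_rank": sorted(compounds, key=sum_rank.get),  # lower total rank is better
        "sum_score": sorted(compounds, key=sum_score.get, reverse=True),
        "reciprocal_rank": sorted(compounds, key=recip_rank.get, reverse=True),
    }


# Hypothetical scores from three screening methods (higher = more likely active).
similarity = {"cpd_A": 0.91, "cpd_B": 0.40, "cpd_C": 0.75, "cpd_D": 0.10}
sar_model = {"cpd_A": 0.30, "cpd_B": 0.95, "cpd_C": 0.80, "cpd_D": 0.05}
docking = {"cpd_A": 7.1, "cpd_B": 6.0, "cpd_C": 9.4, "cpd_D": 5.2}  # sign flipped so higher is better

fused = fuse([similarity, sar_model, docking])
for name, ordering in fused.items():
    print(name, ordering)
```

Reciprocal rank rewards a compound that is ranked very highly by any single method, whereas sum rank favors compounds that are ranked consistently well by all methods; on this small example the three schemes happen to agree.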
11.
J Chem Inf Model ; 63(1): 111-125, 2023 01 09.
Article in English | MEDLINE | ID: mdl-36472475

ABSTRACT

Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on a combination of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis illustrate that the best prediction model, based on Attentive FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value of 67.5% for the external validation set. Additionally, the Shapley additive explanation (SHAP) and atom heatmap approaches were utilized to discover the important features and structural fragments related to hematotoxicity, which could offer helpful tips to detect undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to further characterize and investigate the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the novel graph-based deep learning algorithms and insightful interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.


Subjects
Deep Learning; Computer Simulation; Machine Learning; Algorithms; Drug Discovery
12.
J Chem Inf Model ; 63(8): 2345-2359, 2023 04 24.
Article in English | MEDLINE | ID: mdl-37000044

ABSTRACT

The n-octanol/buffer solution distribution coefficient at pH = 7.4 (log D7.4) is an indicator of lipophilicity, and it influences a wide variety of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties and druggability of compounds. In log D7.4 prediction, graph neural networks (GNNs) can uncover subtle structure-property relationships (SPRs) by automatically extracting features from molecular graphs that facilitate the learning of SPRs, but their performances are often limited by the small size of available datasets. Herein, we present a transfer learning strategy called pretraining on computational data and then fine-tuning on experimental data (PCFE) to fully exploit the predictive potential of GNNs. PCFE works by pretraining a GNN model on 1.71 million computational log D data (low-fidelity data) and then fine-tuning it on 19,155 experimental log D7.4 data (high-fidelity data). The experiments for three GNN architectures (graph convolutional network (GCN), graph attention network (GAT), and Attentive FP) demonstrated the effectiveness of PCFE in improving GNNs for log D7.4 predictions. Moreover, the optimal PCFE-trained GNN model (cx-Attentive FP, R²_test = 0.909) outperformed four excellent descriptor-based models (random forest (RF), gradient boosting (GB), support vector machine (SVM), and extreme gradient boosting (XGBoost)). The robustness of the cx-Attentive FP model was also confirmed by evaluating the models with different training data sizes and dataset splitting strategies. Therefore, we developed a webserver and defined the applicability domain for this model. The webserver (http://tools.scbdd.com/chemlogd/) provides free log D7.4 prediction services. In addition, the important descriptors for log D7.4 were detected by the Shapley additive explanations (SHAP) method, and the most relevant substructures of log D7.4 were identified by the attention mechanism. Finally, the matched molecular pair analysis (MMPA) was performed to summarize the contributions of common chemical substituents to log D7.4, including a variety of hydrocarbon groups, halogen groups, heteroatoms, and polar groups. In conclusion, we believe that the cx-Attentive FP model can serve as a reliable tool to predict log D7.4 and hope that pretraining on low-fidelity data can help GNNs make accurate predictions of other endpoints in drug discovery.


Subjects
Drug Discovery; Halogens; 1-Octanol; Learning; Neural Networks, Computer
13.
Zhongguo Dang Dai Er Ke Za Zhi ; 23(9): 877-881, 2021.
Article in English, Chinese | MEDLINE | ID: mdl-34535200

ABSTRACT

OBJECTIVES: To study the efficacy of Huaiqihuang granules as adjuvant therapy for bronchial asthma in children. METHODS: A multicenter, prospective, and registered real-world study was performed on children aged 2-5 years who had a confirmed diagnosis of bronchial asthma in the outpatient service of 21 hospitals in China. Among these children, those treated with medications for long-term asthma control (inhaled corticosteroid and/or leukotriene receptor antagonist) without Huaiqihuang granules were enrolled as the control treatment group, and those treated with medications for long-term asthma control combined with Huaiqihuang granules were enrolled as the combined treatment group. The medical data of all children were collected. Outpatient or telephone follow-up was performed at weeks 4, 8, 12, 20, 28, and 36 after treatment, covering asthma attacks and rhinitis symptoms. A statistical analysis was performed for the changes in these indices. RESULTS: There was no significant difference in the frequency of asthma attacks or rhinitis attacks between the two groups before treatment (P>0.05). After treatment, the combined treatment group had significantly lower frequencies of asthma attacks, severe asthma attacks, and rhinitis attacks compared with the control treatment group (P<0.05). There was no significant difference in the incidence rate of adverse reactions between the two groups (P=0.667). CONCLUSIONS: Huaiqihuang granules in addition to medications for long-term asthma control can alleviate the symptoms of bronchial asthma and rhinitis and improve the level of asthma control in children with bronchial asthma, with good safety and few adverse effects.


Subjects
Asthma; Drugs, Chinese Herbal; Asthma/drug therapy; Child; Drugs, Chinese Herbal/therapeutic use; Humans; Prospective Studies; Quality of Life
14.
J Chem Inf Model ; 60(9): 4216-4230, 2020 09 28.
Article in English | MEDLINE | ID: mdl-32352294

ABSTRACT

Virtual screening (VS) based on molecular docking is an efficient method used for retrieving novel hit compounds in drug discovery. However, the accuracy of current docking scoring functions (SFs) is usually insufficient. In this study, in order to improve the screening power of SFs, a novel approach named EAT-Score was proposed by directly utilizing the energy auxiliary terms (EAT) provided by molecular docking scoring through eXtreme Gradient Boosting (XGBoost). Here, EAT specifically refers to the output of the Molecular Operating Environment (MOE) scoring, including the energy scores of five different classical SFs and the Protein-Ligand Interaction Fingerprint (PLIF) terms. The performance of EAT-Score in discriminating actives from decoys was strictly validated on the DUD-E diverse subset by using different performance metrics. The results showed that EAT-Score performed much better than classical SFs in VS, with its AUC values exhibiting an improvement of around 0.3. Meanwhile, EAT-Score could achieve comparable or even better prediction performance compared with other state-of-the-art VS methods, such as some machine learning (ML)-based SFs and classical SFs implemented in docking programs, in terms of AUC, LogAUC, or BEDROC. Furthermore, the EAT-Score model can capture important binding pattern information from protein-ligand complexes by Shapley additive explanations (SHAP) analysis, which may be very helpful in interpreting the ligand binding mechanism for a certain target and thereby guiding drug design.


Subjects
Machine Learning; Proteins; Ligands; Molecular Docking Simulation; Protein Binding; Proteins/metabolism
15.
J Chem Inf Model ; 60(4): 2031-2043, 2020 04 27.
Article in English | MEDLINE | ID: mdl-32202787

ABSTRACT

Luciferase-based bioluminescence detection techniques are highly favored in high-throughput screening (HTS), in which the firefly luciferase (FLuc) is the most commonly used variant. However, FLuc inhibitors can interfere with the activity of luciferase, which may result in false positive signals in HTS assays. In order to reduce the unnecessary cost of time and money, an in silico prediction model for FLuc inhibitors is highly desirable. In this study, we built an extensive data set consisting of 20 888 FLuc inhibitors and 198 608 noninhibitors, and then developed a group of classification models based on the combination of three machine learning (ML) algorithms and four types of molecular representations. The best prediction model based on XGBoost and ECFP4 and MOE2d descriptors yielded a balanced accuracy (BA) of 0.878 and an area under the receiver operating characteristic curve (AUC) value of 0.958 for the validation set, and a BA of 0.886 and an AUC of 0.947 for the test set. Three external validation sets, including set 1 (3231 FLuc inhibitors and 69 783 noninhibitors), set 2 (695 FLuc inhibitors and 75 913 noninhibitors), and set 3 (1138 FLuc inhibitors and 8155 noninhibitors), were used to verify the predictive ability of our models. The BA values for the three external validation sets given by the best model are 0.864, 0.845, and 0.791, respectively. In addition, the important features or structural fragments related to FLuc inhibitors were recognized by the Shapley additive explanations (SHAP) method along with their influences on predictions, which may provide valuable clues to detecting undesirable luciferase inhibitors. Based on the important and explanatory features, 16 rules were proposed for detecting FLuc inhibitors, which can achieve a correction rate of 70% for FLuc inhibitors. 
Furthermore, a comparison with existing prediction rules and models for FLuc inhibitors used in virtual screening verified the high reliability of the models and rules proposed in this study. We also used the model to screen three curated chemical databases, and almost 10% of the molecules in the evaluated databases were predicted as inhibitors, highlighting the potential risk of false positives in luciferase-based assays. Finally, a public web server called ChemFLuc was developed (http://admet.scbdd.com/chemfluc/index/), and it offers a freely available service to predict potential FLuc inhibitors.


Subjects
Databases, Chemical; High-Throughput Screening Assays; Algorithms; Luciferases; Reproducibility of Results
16.
J Chem Inf Model ; 60(1): 63-76, 2020 01 27.
Article in English | MEDLINE | ID: mdl-31869226

ABSTRACT

Lipophilicity, as evaluated by the n-octanol/buffer solution distribution coefficient at pH = 7.4 (log D7.4), is a major determinant of various absorption, distribution, metabolism, elimination, and toxicology (ADMET) parameters of drug candidates. In this study, we developed several quantitative structure-property relationship (QSPR) models to predict log D7.4 based on a large and structurally diverse data set. Eight popular machine learning algorithms were employed to build the prediction models with 43 molecular descriptors selected by a wrapper feature selection method. The results demonstrated that XGBoost yielded better prediction performance than any other single model (R²_T = 0.906 and RMSE_T = 0.395). Moreover, the consensus model from the top three models could continue to improve the prediction performance (R²_T = 0.922 and RMSE_T = 0.359). The robustness, reliability, and generalization ability of the models were strictly evaluated by the Y-randomization test and applicability domain analysis. Moreover, the group contribution model based on 110 atom types and the local models for different ionization states were also established and compared to the global models. The results demonstrated that the descriptor-based consensus model is superior to the group contribution method, and the local models have no advantage over the global models. Finally, matched molecular pair (MMP) analysis and descriptor importance analysis were performed to extract transformation rules and give some explanations related to log D7.4. In conclusion, we believe that the consensus model developed in this study can be used as a reliable and promising tool to evaluate log D7.4 in drug discovery.
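
The consensus idea in the abstract above can be illustrated with a small sketch. The abstract does not say how the top three models were combined, so simple averaging of their predictions is an assumption here. The log D values and model outputs below are made-up illustrative numbers, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def consensus(predictions):
    """Average the per-molecule predictions of several models (assumed scheme)."""
    return [sum(column) / len(column) for column in zip(*predictions)]


# Hypothetical observed log D values and predictions from three models.
y_true = [1.0, 2.0, 3.0, 4.0]
model_a = [1.2, 1.8, 3.3, 3.9]
model_b = [0.9, 2.3, 2.8, 4.2]
model_c = [1.1, 2.0, 3.1, 3.7]

y_consensus = consensus([model_a, model_b, model_c])
for name, pred in [("A", model_a), ("B", model_b), ("C", model_c), ("consensus", y_consensus)]:
    print(name, "R2 =", round(r_squared(y_true, pred), 3), "RMSE =", round(rmse(y_true, pred), 3))
```

In this toy example the individual errors partly cancel, so the averaged prediction has a lower RMSE than any single model. Averaging is not guaranteed to help when the models make correlated errors.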


Subjects
Machine Learning , Models, Molecular , Algorithms , Drug Discovery/methods , Lipids/chemistry , Quantitative Structure-Activity Relationship
17.
Clin Gastroenterol Hepatol ; 17(7): 1303-1310.e18, 2019 06.
Article in English | MEDLINE | ID: mdl-29654915

ABSTRACT

BACKGROUND & AIMS: The Chinese herbal medicine, MaZiRenWan (MZRW), has been used for more than 2000 years to treat constipation, but it has not been tested in a randomized controlled trial. We performed a trial to evaluate the efficacy and safety of MZRW, compared with the stimulant laxative senna or placebo, for patients with functional constipation (FC). METHODS: We performed a double-blind, double-dummy trial of 291 patients with FC based on Rome III criteria, seen at 8 clinics in Hong Kong from June 2013 through August 2015. Patients were observed for 2 weeks and then assigned randomly (1:1:1) to groups given MZRW (7.5 g, twice daily), senna (15 mg daily), or placebo for 8 weeks. Patients were then followed for 8 weeks and evaluated at baseline and weeks 4, 8 (end of treatment), and 16 (end of follow up). Participants recorded information on stool form and frequency, feeling of complete evacuation, and research medication taken. Data on individual bowel symptoms, global symptom improvement, and adverse events were collected. A complete response was defined as an increase of ≥1 complete spontaneous bowel movement (CSBM)/week from baseline (the primary outcome). Secondary outcomes included response during the follow-up period, colonic transit, individual and global symptom assessments, quality of life measured with 36-item short form Chinese version, and adverse events. RESULTS: Although there was no statistically significant difference in proportions of patients with a complete response to MZRW (68%) vs. senna (57.7%) (P = .14) at week 8, there was a statistically significant difference vs. placebo (33.0%) (P < .005). At the 16-week timepoint (after the 8-week follow-up period), 47.4% of patients had a complete response to MZRW, 20.6% had a complete response to senna, and 17.5% had a complete response to placebo (P < .005 for MZRW vs. placebo). 
The group that received MZRW also had significant increases in colonic transit and reduced severity of constipation, straining, incomplete evacuation, and global constipation symptoms compared with the groups that received placebo or senna (P < .05 for all comparisons). CONCLUSIONS: In a randomized controlled trial of 291 patients with FC, we found MZRW to be well-tolerated and effective in increasing CSBM/week. MZRW did not appear to be more effective than senna and might be considered as an alternative to this drug. ClinicalTrials.gov no: NCT01695850.


Subjects
Constipation/drug therapy , Defecation/drug effects , Drugs, Chinese Herbal/therapeutic use , Quality of Life , Constipation/physiopathology , Dose-Response Relationship, Drug , Double-Blind Method , Female , Follow-Up Studies , Humans , Male , Middle Aged , Retrospective Studies , Treatment Outcome
18.
J Chem Inf Model ; 59(9): 3714-3726, 2019 09 23.
Article in English | MEDLINE | ID: mdl-31430151

ABSTRACT

Aggregation has been posing a great challenge in drug discovery. Current computational approaches aiming to filter out aggregated molecules based on their similarity to known aggregators, such as Aggregator Advisor, have low prediction accuracy, and therefore development of reliable in silico models to detect aggregators is highly desirable. In this study, we built a data set consisting of 12 119 aggregators and 24 172 drugs or drug candidates and then developed a group of classification models based on the combination of two ensemble learning approaches and five types of molecular representations. The best model yielded an accuracy of 0.950 and an area under the curve (AUC) value of 0.987 for the training set, and an accuracy of 0.937 and an AUC of 0.976 for the test set. The best model also gave reliable predictions for the external validation set with 5681 aggregators, since 80% of molecules were predicted to be aggregators with a prediction probability higher than 0.9. More importantly, we explored the relationship between colloidal aggregation and molecular features, and generalized a set of simple rules to detect aggregators. Molecular features, such as log D, the number of hydroxyl groups, the number of aromatic carbons attached to a hydrogen atom, and the number of sulfur atoms in aromatic heterocycles, would be helpful to distinguish aggregators from nonaggregators. A comparison with numerous existing druglikeness and aggregation filtering rules and models used in virtual screening verified the high reliability of the model and rules proposed in this study. We also used the model to screen several curated chemical databases, and almost 20% of molecules in the evaluated databases were predicted as aggregators, highlighting the potential high risk of aggregation in screening. Finally, we developed an online web server, ChemAGG (http://admet.scbdd.com/ChemAGG/index), which offers a freely available tool to detect aggregators.


Subjects
Drug Discovery/methods , Pharmaceutical Preparations/chemistry , Computer Simulation , Databases, Pharmaceutical , Drug Design , Humans , Molecular Structure , Software , Structure-Activity Relationship
19.
Molecules ; 23(7)2018 06 28.
Article in English | MEDLINE | ID: mdl-29958399

ABSTRACT

Polysaccharides, which exert immunoregulatory effects, are becoming increasingly popular as food supplements; however, certain components of ordinary foods could reduce the polysaccharides' beneficial effects. Quercetin, a flavonoid found in common fruits and vegetables, is one such component. This study investigated the effects of quercetin on macrophage activation induced by the Astragalus polysaccharide RAP. The results show that quercetin decreases NO production and iNOS gene expression in RAW264.7 cells, and that it inhibits the production of cytokines in RAW264.7 cells and peritoneal macrophages. Western blot analysis results suggest that quercetin inhibits the phosphorylation of Akt/mTORC1, MAPKs, and TBK1, but has no effect on NF-κB in RAP-induced RAW264.7 cells. Taken together, the results show that quercetin partly inhibits macrophage activation by the Astragalus polysaccharide RAP. This study demonstrates that quercetin-containing foods may interfere with the immune-enhancing effects of Astragalus polysaccharide RAP to a certain extent.


Subjects
Astragalus Plant/chemistry , Polysaccharides/pharmacology , Animals , Macrophage Activation/drug effects , Mice , NF-kappa B/metabolism , Phosphorylation/drug effects , Quercetin/pharmacology , RAW 264.7 Cells , Signal Transduction/drug effects
20.
BMC Bioinformatics ; 18(1): 165, 2017 Mar 11.
Article in English | MEDLINE | ID: mdl-28284192

ABSTRACT

BACKGROUND: Many computational approaches have been used for target prediction, including machine learning, reverse docking, bioactivity spectra analysis, and chemical similarity searching. Recent studies have suggested that chemical similarity searching may be driven by the most-similar ligand. However, the extent of bioactivity of most-similar ligands has been oversimplified or even neglected in these studies, and this has impaired the prediction power. RESULTS: Here we propose the MOst-Similar ligand-based Target inference approach, namely MOST, which uses fingerprint similarity and explicit bioactivity of the most-similar ligands to predict targets of the query compound. Performance of MOST was evaluated by using combinations of different fingerprint schemes, machine learning methods, and bioactivity representations. In sevenfold cross-validation with a benchmark Ki dataset from CHEMBL release 19 containing 61,937 bioactivity data of 173 human targets, MOST achieved high average prediction accuracy (0.95 for pKi ≥ 5, and 0.87 for pKi ≥ 6). Morgan fingerprint was shown to be slightly better than FP2. Logistic Regression and Random Forest methods performed better than Naïve Bayes. In a temporal validation, the Ki dataset from CHEMBL19 was used to train models and predict the bioactivity of newly deposited ligands in CHEMBL20. MOST also performed well with high accuracy (0.90 for pKi ≥ 5, and 0.76 for pKi ≥ 6), when Logistic Regression and Morgan fingerprint were employed. Furthermore, the p values associated with explicit bioactivity were found to be a robust index for removing false positive predictions. Implicit bioactivity did not offer this capability. Finally, p values generated with Logistic Regression, Morgan fingerprint and explicit activity were integrated with a false discovery rate (FDR) control procedure to reduce false positives in a multiple-target prediction scenario, and the success of this strategy was demonstrated with a case of fluanisone. 
In the case of aloe-emodin's laxative effect, MOST predicted that acetylcholinesterase was the mechanism-of-action target; in vivo studies validated this prediction. CONCLUSIONS: Using the MOST approach can result in highly accurate and robust target prediction. Integrated with an FDR control procedure, MOST provides a reliable framework for multiple-target inference. It has prospective applications in drug repurposing and mechanism-of-action target prediction.
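
The abstract above mentions an FDR control procedure applied to per-target p values but does not name it. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up procedure, one widely used FDR method. Its use here is an assumption, and the p values are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    (i.e. the predicted target is kept) at FDR level alpha."""
    m = len(p_values)
    # Indices of the p values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p values for five candidate targets of one query compound.
p_values = [0.6, 0.001, 0.041, 0.008, 0.039]
print(benjamini_hochberg(p_values, alpha=0.05))
```

With these inputs only the two smallest p values survive, even though four of the five fall below an uncorrected 0.05 threshold. That is the kind of false-positive reduction the abstract describes for the multiple-target setting.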


Subjects
Ligands , Machine Learning , Acetylcholinesterase/chemistry , Acetylcholinesterase/metabolism , Aloe/chemistry , Aloe/metabolism , Animals , Cathartics/chemistry , Cathartics/metabolism , Databases, Chemical , Emodin/chemistry , Emodin/metabolism , Humans , Kinetics , Logistic Models