1.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39129365

ABSTRACT

Enzymatic reaction kinetics are central in analyzing enzymatic reaction mechanisms and target-enzyme optimization, and thus in biomanufacturing and other industries. The enzyme turnover number (kcat) and Michaelis constant (Km), key kinetic parameters for measuring enzyme catalytic efficiency, are crucial for analyzing enzymatic reaction mechanisms and the directed evolution of target enzymes. Experimental determination of kcat and Km is costly in time, labor, and money. To account for the intrinsic connection between kcat and Km and further improve prediction performance, we propose a universal pretrained multitask deep learning model, MPEK, to predict these parameters simultaneously while considering pH, temperature, and organismal information. When tested on the same kcat and Km test datasets, MPEK demonstrated superior prediction performance over previous models. Specifically, MPEK achieved a Pearson coefficient of 0.808 for predicting kcat, an improvement of ca. 14.6% and 7.6% over the DLKcat and UniKP models, respectively, and a Pearson coefficient of 0.777 for predicting Km, an improvement of ca. 34.9% and 53.3% over the Kroll_model and UniKP models, respectively. More importantly, MPEK was able to reveal enzyme promiscuity and was sensitive to slight changes in the mutant enzyme sequence. In addition, three case studies showed that MPEK has potential for assisted enzyme mining and directed evolution. To facilitate in silico evaluation of enzyme catalytic efficiency, we have established a web server implementing this model, which can be accessed at http://mathtc.nscc-tj.cn/mpek.
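
As a reminder of how the two predicted parameters combine, the sketch below evaluates the standard Michaelis-Menten rate law and the catalytic efficiency kcat/Km. It is not part of MPEK; the function names and the example values are illustrative assumptions.

```python
def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial reaction rate v = kcat * [E] * [S] / (Km + [S]).

    kcat is in 1/s; km, enzyme_conc and substrate_conc share one
    concentration unit, so v comes out in that unit per second.
    """
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat/Km (higher means a more efficient enzyme)."""
    return kcat / km


# Hypothetical values: when [S] equals Km, the rate is half of Vmax = kcat * [E].
half_vmax = michaelis_menten_rate(kcat=10.0, km=2.0, enzyme_conc=1.0, substrate_conc=2.0)
```

Because the rate depends on both parameters at once, errors in kcat and Km do not act independently, which is one plausible reading of why the abstract argues for predicting them jointly.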


Subjects
Deep Learning , Enzymes , Kinetics , Enzymes/metabolism , Enzymes/chemistry , Algorithms , Computational Biology/methods
3.
ACS Omega ; 9(24): 26213-26221, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38911735

ABSTRACT

Accurate and rapid evaluation of density is crucial for evaluating the packing and combustion characteristics of high-energy-density fuels (HEDFs). This parameter is pivotal in the selection of high-performance HEDFs. Our study leveraged a polycyclic compound density data set and quantum chemical (QC) descriptors to establish a correlation with the target properties using the XGBoost algorithm. We utilized a recursive feature elimination method to simplify the model and developed a concise and interpretable density prediction model incorporating only six QC descriptors. The model demonstrated robust performance, achieving coefficients of determination (R²) of 0.967 and 0.971 for the internal and external test sets, respectively, and root-mean-square errors (RMSE) of 0.031 and 0.027 g/cm³, respectively. Compared with two other mainstream methods, the marginal discrepancy between the predicted and actual molecular densities underscores the model's superior predictive ability and greater usefulness for energy density calculation. Furthermore, we developed a web server (SesquiterPre, https://sespre.cmdrg.com/#/) that can simultaneously calculate the density, enthalpy of combustion, and energy density of sesquiterpenoid HEDFs, which greatly facilitates use by researchers and is of great significance for accelerating the design and screening of novel sesquiterpenoid HEDFs.
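
The two metrics reported above can be computed as in the following minimal sketch. The function names are illustrative assumptions and are not taken from the paper's code.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same unit as the target (e.g. g/cm^3)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A model that always predicts the mean of the observed values has R² = 0, so values near 0.97 indicate that most of the variance in density is explained.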

4.
J Cheminform ; 16(1): 48, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685101

ABSTRACT

Previous studies have shown that the three-dimensional (3D) geometric and electronic structures of molecules play a crucial role in determining their key properties and intermolecular interactions. Therefore, it is necessary to establish a quantum chemical (QC) property database containing the most stable 3D geometric conformations and electronic structures of molecules. In this study, a high-quality QC property database, called QuanDB, was developed, which includes structurally diverse molecular entities and features a user-friendly interface. Currently, QuanDB contains 154,610 compounds sourced from public databases and scientific literature, with 10,125 scaffolds. The elemental composition comprises nine elements: H, C, O, N, P, S, F, Cl, and Br. For each molecule, QuanDB provides 53 global and 5 local QC properties and the most stable 3D conformation. These properties are divided into three categories: geometric structure, electronic structure, and thermodynamics. Geometric structure optimization and single-point energy calculation at the theoretical levels of B3LYP-D3(BJ)/6-311G(d)/SMD/water and B3LYP-D3(BJ)/def2-TZVP/SMD/water, respectively, were applied to ensure highly accurate calculations of QC properties, with the computational cost exceeding 10^7 core-hours. QuanDB provides high-value geometric and electronic structure information for use in molecular representation models, which are critical for machine-learning-based molecular design, thereby contributing to a comprehensive description of the chemical compound space. As a new high-quality dataset for QC properties, QuanDB is expected to become a benchmark tool for the training and optimization of machine learning models, thus further advancing the development of novel drugs and materials. QuanDB is freely available, without registration, at https://quandb.cmdrg.com/ .

5.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38385872

ABSTRACT

Drug discovery and development constitute a laborious and costly undertaking. The success of a drug hinges not only on good efficacy but also on acceptable absorption, distribution, metabolism, elimination, and toxicity (ADMET) properties. Overall, up to 50% of drug development failures have been attributed to undesirable ADMET profiles. As a multi-parameter objective, the optimization of ADMET properties is extremely challenging owing to the vast chemical space and limited human expert knowledge. In this study, a freely available platform called Chemical Molecular Optimization, Representation and Translation (ChemMORT) is developed for the optimization of multiple ADMET endpoints without the loss of potency (https://cadd.nscc-tj.cn/deploy/chemmort/). ChemMORT contains three modules: Simplified Molecular Input Line Entry System (SMILES) Encoder, Descriptor Decoder and Molecular Optimizer. The SMILES Encoder generates a molecular representation as a 512-dimensional vector, and the Descriptor Decoder translates this representation back to the corresponding molecular structure with high accuracy. Based on this reversible molecular representation and a particle swarm optimization strategy, the Molecular Optimizer can be used to effectively optimize undesirable ADMET properties without the loss of bioactivity, which essentially accomplishes inverse QSAR design. The constrained multi-objective optimization of a poly (ADP-ribose) polymerase-1 inhibitor is provided as a case study to explore the utility of ChemMORT.
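
To illustrate the particle swarm optimization strategy named above, here is a generic toy sketch that minimizes a simple function over a continuous vector. It is not ChemMORT's optimizer: the objective, the bounds, and all hyperparameters (`inertia`, `c1`, `c2`, swarm size) are assumptions chosen for illustration, whereas the real system would score decoded molecules with ADMET and potency models.

```python
import random


def particle_swarm_minimize(objective, dim, n_particles=20, n_iters=100,
                            bounds=(-5.0, 5.0), inertia=0.7, c1=1.5, c2=1.5,
                            seed=0):
    """Minimize `objective` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position
    best_val = [objective(p) for p in pos]      # and the value found there
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to own best, pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp the coordinate to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

In a latent-space setting, `objective` would decode each position to a molecule and return a penalty combining the ADMET endpoints and the potency constraint.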


Subjects
Deep Learning , Humans , Drug Development , Drug Discovery , Poly(ADP-ribose) Polymerase Inhibitors
6.
Molecules ; 29(2)2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38257208

ABSTRACT

TRPV1 channel agonists and antagonists, which have powerful analgesic effects without the addictive qualities associated with traditional analgesics, have become a focus area for the development of novel analgesics. In this study, quantitative structure-activity relationship (QSAR) models for three bioactive endpoints (Ki, IC50, and EC50) were successfully constructed using four machine learning algorithms: SVM, Bagging, GBDT, and XGBoost. These models were based on 2922 TRPV1 modulators and incorporated four types of molecular descriptors: Daylight, E-state, ECFP4, and MACCS. After rigorous five-fold cross-validation and external test set validation, the optimal models for the three endpoints were obtained. For the Ki endpoint, the Bagging-ECFP4 model had a Q² value of 0.778 and an R² value of 0.780. For the IC50 endpoint, the XGBoost-ECFP4 model had a Q² value of 0.806 and an R² value of 0.784. For the EC50 endpoint, the SVM-Daylight model had a Q² value of 0.784 and an R² value of 0.809. These results demonstrate that the constructed models exhibit good predictive performance. In addition, based on the model feature importance analysis, the relationship between substructure and biological activity was also explored, which can provide important theoretical guidance for the efficient virtual screening and structural optimization of novel TRPV1 analgesics. Subsequent studies on novel TRPV1 modulators will be based on the feature substructures of the three endpoints.


Subjects
Algorithms , Data Accuracy , Machine Learning , Quantitative Structure-Activity Relationship , Analgesics/pharmacology
7.
Front Psychiatry ; 14: 1187111, 2023.
Article in English | MEDLINE | ID: mdl-37680447

ABSTRACT

Background: Schizophrenia (SCZ) is a serious chronic mental disorder. Our previous case-control genetic association study has shown that microRNA-137 (miR-137) may only protect females against SCZ. Since estrogen, an important female sex hormone, exerts neuroprotective effects, the relationship between estrogen and miR-137 in the pathophysiology of SCZ was further examined in this study. Methods: Genotyping of single-nucleotide polymorphism rs1625579 of the miR-137 gene in 1,004 SCZ patients and 896 healthy controls was conducted using the iMLDR assay. The effect of estradiol (E2) on miR-137 expression was evaluated in the human mammary adenocarcinoma cell line (MCF-7) and the mouse hippocampal neuron cell line (HT22). The relationships between serum E2, prolactin (PRL), and peripheral blood miR-137 were investigated in 41 SCZ patients and 43 healthy controls. The miR-137 and other reference miRNAs were detected by real-time fluorescent quantitative reverse transcription-PCR. Results: Based on the well-known SNP rs1625579, the distributions of protective genotypes and alleles of the miR-137 gene were not different between patients and healthy controls but were marginally significantly lower in female patients. E2 upregulated the expression of miR-137 to 2.83 and 1.81 times in MCF-7 and HT22 cells, respectively. Both serum E2 and blood miR-137 were significantly decreased or downregulated in SCZ patients, but they lacked the expected positive correlation with each other in both patients and controls. When stratified by sex, blood miR-137 was negatively correlated with serum E2 in female patients. On the other hand, serum PRL was significantly increased in SCZ patients, and the female patients had the highest serum PRL level and a negative correlation between serum PRL and blood miR-137. Conclusion: The plausible SCZ-protective effect of miR-137 may be female specific, and the underlying mechanism may be that E2 upregulates the expression of miR-137. This protective mechanism may also be abrogated by elevated PRL in female patients. These preliminary findings suggest a new genetic/environmental interaction mechanism by which E2/miR-137 protects normal females against SCZ and a novel E2/PRL/miR-137-related pathophysiology of female SCZ, implying new antipsychotic approaches for female patients in the future.

8.
Int J Mol Sci ; 24(16)2023 Aug 17.
Article in English | MEDLINE | ID: mdl-37629069

ABSTRACT

Transcription factors containing a CCCH structure (C3H) play important roles in plant growth, development, and stress response, but research on the C3H gene family in potato has not yet been reported. In this study, we used bioinformatics to identify 50 C3H genes in potato and named them StC3H-1 to StC3H-50 according to their location on chromosomes, and we analyzed their physical and chemical properties, chromosome location, phylogenetic relationships, gene structure, collinearity relationships, and cis-regulatory elements. The gene expression pattern analysis showed that many StC3H genes are involved in potato growth and development and in the response to diverse environmental stresses. Furthermore, RT-qPCR data showed that the expression of many StC3H genes was induced by high temperatures, indicating that StC3H genes may play important roles in the potato response to heat stress. In addition, some StC3H genes were predominantly expressed in the stolon and developing tubers, suggesting that these StC3H genes may be involved in the regulation of tuber development. Together, these results provide new information on StC3H genes and will be helpful for further revealing the function of StC3H genes in the heat stress response and tuber development in potato.


Subjects
Solanum tuberosum , Solanum tuberosum/genetics , Phylogeny , Computational Biology , Gene Expression Profiling , Zinc Fingers
9.
Chem Biol Drug Des ; 102(3): 409-423, 2023 09.
Article in English | MEDLINE | ID: mdl-37489095

ABSTRACT

The transient receptor potential vanilloid 1 (TRPV1) channel belongs to the transient receptor potential channel superfamily and participates in many physiological processes. TRPV1 modulators (both agonists and antagonists) can effectively inhibit pain caused by various factors and have curative effects in various diseases, such as itch, cancer, and cardiovascular diseases. Therefore, the development of TRPV1 channel modulators is of great importance. In this study, structure-based and ligand-based virtual screening methods were each used to screen compound databases. In the structure-based virtual screening route, a full-length human TRPV1 protein was first constructed, three molecular docking methods with different precisions were performed based on the hTRPV1 structure, and a machine learning-based rescoring model using the XGBoost algorithm was constructed to enrich active compounds. In the ligand-based virtual screening route, the ROCS program was used for 3D shape similarity searching and the EON program was used for electrostatic similarity searching. Finally, 77 compounds were selected from the two routes for in vitro assays. The results showed that 8 of them were active compounds, including three hits with IC50 values close to that of capsazepine. In addition, one hit is a partial agonist with both agonistic and antagonistic activity. The mechanisms of some active compounds were investigated by molecular dynamics simulation, which explained their agonism or antagonism.


Subjects
Machine Learning , Molecular Dynamics Simulation , Humans , Molecular Docking Simulation , Ligands , TRPV Cation Channels
10.
Environ Sci Pollut Res Int ; 29(15): 22439-22453, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34787806

ABSTRACT

Green credit is one of the effective tools to save energy and reduce pollution, and it is mainly applied in industry. Thus, this paper explores the impact of green credit on the upgrading of China's industrial structure from the perspective of industrial sectors, by means of a dynamic panel model with data from 2005 to 2016. The upgrading of industrial structure is analyzed along three dimensions: rationalization of industrial structure (RIS), advancement of industrial structure (AIS), and greenization of industrial structure (GIS). The empirical results are also explained by four influence mechanisms: resource allocation, technological innovation, credit catalysis, and policy guidance. This paper finds that on the national level, green credit has a positive impact on the upgrading of China's industrial structure and plays a significant role in promoting the greenization and advancement of industrial structure. However, on the regional level, the effect of green credit is more complex. First, green credit has a significant positive effect on the GIS in the eastern, central, and western regions of China, which suggests that green credit is conducive to cleaner industrial production across the country. Second, green credit also has a positive impact on the AIS in these three regions, but the effect is only significant in the eastern region. Third, in terms of the RIS, the effect of green credit is positive but not significant in the eastern and central regions. In the western region it is negative but likewise not significant, which can be explained from the perspective of the resource allocation and technological innovation mechanisms. In addition, there is a significant positive correlation between the previous-period and current values of RIS, AIS, and GIS, which indicates significant positive inertia in the upgrading of China's industrial structure.


Subjects
Industry , Inventions , China , Environmental Pollution , Policy
11.
Nucleic Acids Res ; 50(D1): D1200-D1207, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34634800

ABSTRACT

Drug-drug interaction (DDI) can trigger many adverse effects in patients and has emerged as a threat to medicine and public health. Despite the continuous accumulation of information on clinically significant DDIs, there are few open-access knowledge systems dedicated to the curation of DDI associations. To help clinicians screen for dangerous drug combinations and to improve health systems, we present DDInter, a curated DDI database offering the scientific community comprehensive data, practical medication guidance, an intuitive function interface, and powerful visualization. Currently, DDInter contains about 0.24M DDI associations connecting 1833 approved drugs (1972 entities). Each drug is annotated with basic chemical and pharmacological information and its interaction network. For DDI associations, abundant and professional annotations are provided, including severity, mechanism description, strategies for managing potential side effects, alternative medications, etc. The drug entities and interaction entities are efficiently cross-linked. In addition to basic query and browsing, a prescription checking function was developed to help clinicians decide whether drug combinations can be used safely. It can also be used for informatics-based DDI investigation and evaluation of other prediction frameworks. We hope that DDInter will prove useful in improving clinical decision-making and patient safety. DDInter is freely available, without registration, at http://ddinter.scbdd.com/.


Subjects
Databases, Factual , Drug Interactions/genetics , Drug-Related Side Effects and Adverse Reactions/classification , Software , Clinical Decision-Making , Drug-Related Side Effects and Adverse Reactions/genetics , Humans , Patient Safety
12.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34427296

ABSTRACT

Computational methods have become indispensable tools to accelerate the drug discovery process and alleviate the excessive dependence on time-consuming and labor-intensive experiments. Traditional feature-engineering approaches rely heavily on expert knowledge to devise useful features, which can be costly and sometimes biased. The emerging deep learning (DL) methods offer a data-driven way to automatically learn expressive representations from complex raw data. Inspired by this, researchers have attempted to apply various deep neural network models to simplified molecular input line entry specification (SMILES) strings, which contain all the composition and structure information of molecules. However, current models usually suffer from the scarcity of labeled data. This results in a low generalization ability of SMILES-based DL models, which prevents them from competing with the state-of-the-art computational methods. In this study, we utilized the BiLSTM (bidirectional long short-term memory) attention network (BAN), in which we employed a novel multi-step attention mechanism to facilitate the extraction of key features from the SMILES strings. Meanwhile, SMILES enumeration was utilized as a data augmentation method in the training phase to substantially increase the amount of labeled data and the probability of mining more patterns from complex SMILES. We again took advantage of SMILES enumeration in the prediction phase to rectify model prediction bias and provide a more accurate prediction. Combined with the BAN model, our strategies can greatly improve the performance of latent features learned from SMILES strings. In 11 canonical absorption, distribution, metabolism, excretion and toxicity-related tasks, our method outperformed the state-of-the-art approaches.


Subjects
Cheminformatics/methods , Deep Learning , Drug Discovery/methods , Software , Algorithms , Drug Development , Research Design
13.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33940596

ABSTRACT

Poly (ADP-ribose) polymerase-1 (PARP1) has been regarded as a vital target in recent years, and PARP1 inhibitors can be used for ovarian and breast cancer therapies. However, it has been recognized that most PARP1 inhibitors suffer from low solubility and permeability. Therefore, discovering more molecules with novel frameworks would offer greater opportunities to apply them in broader clinical fields and would be of considerable significance. In the present study, multiple virtual screening (VS) methods were employed to evaluate the screening efficiency of ligand-based, structure-based and data fusion methods on the PARP1 target. The VS methods include 2D similarity screening, structure-activity relationship (SAR) models, docking and complex-based pharmacophore screening. Moreover, the sum rank, sum score and reciprocal rank were adopted as data fusion methods. The evaluation results show that similarity searching based on the Torsion fingerprint, six SAR models, Glide docking and pharmacophore screening using Phase have excellent screening performance. The best data fusion method is the reciprocal rank, but the sum score also performs well in framework enrichment. In general, the ligand-based VS methods show better performance on PARP1 inhibitor screening. These findings confirm that adding ligand-based methods to the early screening stage will greatly improve the screening efficiency and help enrich more highly active PARP1 inhibitors with diverse structures.
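
The three data fusion schemes named above can be sketched as follows, using their common textbook definitions. The abstract does not state the exact normalization used in the study, so the min-max scaling in `sum_score` and the assumption that a higher score is better are illustrative choices; the compound names and scores are made up.

```python
def ranks(scores):
    """Map compound -> rank (1 = best) for one method; higher score is better."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {compound: i + 1 for i, compound in enumerate(ordered)}


def sum_rank(methods):
    """Sum of ranks across methods; a LOWER fused value is better."""
    per_method = [ranks(m) for m in methods]
    return {c: sum(r[c] for r in per_method) for c in methods[0]}


def reciprocal_rank(methods):
    """Sum of 1/rank across methods; a HIGHER fused value is better."""
    per_method = [ranks(m) for m in methods]
    return {c: sum(1.0 / r[c] for r in per_method) for c in methods[0]}


def sum_score(methods):
    """Sum of min-max normalized scores; a HIGHER fused value is better."""
    fused = {c: 0.0 for c in methods[0]}
    for m in methods:
        lo, hi = min(m.values()), max(m.values())
        span = hi - lo
        for c, s in m.items():
            # A method that scores every compound equally contributes nothing.
            fused[c] += (s - lo) / span if span else 0.0
    return fused


# Hypothetical scores from two screening methods for three compounds.
docking = {"A": 9.0, "B": 7.0, "C": 5.0}
similarity = {"A": 0.2, "B": 0.9, "C": 0.5}
```

Reciprocal rank rewards a compound that is ranked near the top by at least one method, which is one plausible reason it can outperform a plain rank sum when methods disagree.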


Subjects
Databases, Chemical , Molecular Docking Simulation , Poly (ADP-Ribose) Polymerase-1/antagonists & inhibitors , Poly(ADP-ribose) Polymerase Inhibitors/chemistry , Drug Evaluation, Preclinical , Humans , Poly (ADP-Ribose) Polymerase-1/chemistry , Structure-Activity Relationship
14.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33951729

ABSTRACT

MOTIVATION: Accurate and efficient prediction of molecular properties is one of the fundamental issues in drug design and discovery pipelines. Traditional feature engineering-based approaches require extensive expertise in the feature design and selection process. With the development of artificial intelligence (AI) technologies, data-driven methods exhibit unparalleled advantages over the feature engineering-based methods in various domains. Nevertheless, when applied to molecular property prediction, AI models usually suffer from the scarcity of labeled data and show poor generalization ability. RESULTS: In this study, we proposed molecular graph BERT (MG-BERT), which integrates the local message passing mechanism of graph neural networks (GNNs) into the powerful BERT model to facilitate learning from molecular graphs. Furthermore, an effective self-supervised learning strategy named masked atoms prediction was proposed to pretrain the MG-BERT model on a large amount of unlabeled data to mine context information in molecules. We found the MG-BERT model can generate context-sensitive atomic representations after pretraining and transfer the learned knowledge to the prediction of a variety of molecular properties. The experimental results show that the pretrained MG-BERT model with a little extra fine-tuning can consistently outperform the state-of-the-art methods on all 11 ADMET datasets. Moreover, the MG-BERT model leverages attention mechanisms to focus on atomic features essential to the target property, providing excellent interpretability for the trained model. The MG-BERT model does not require any hand-crafted feature as input and is more reliable due to its excellent interpretability, providing a novel framework to develop state-of-the-art models for a wide range of drug discovery tasks.
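
The masked atoms prediction idea can be illustrated with a small data-preparation sketch: hide a fraction of atom tokens and keep the originals as prediction targets. This is not MG-BERT's code; the `[MASK]` token, the 15% fraction (borrowed from the BERT convention), and the function name are assumptions made for illustration.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token


def mask_atoms(atoms, mask_fraction=0.15, seed=0):
    """Replace a random subset of atom tokens with MASK.

    Returns the masked token list and a dict {index: original atom}
    that a model would be trained to recover during pretraining.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(atoms) * mask_fraction))  # mask at least one atom
    chosen = sorted(rng.sample(range(len(atoms)), n_mask))
    masked = list(atoms)
    targets = {}
    for i in chosen:
        targets[i] = masked[i]
        masked[i] = MASK
    return masked, targets


# Hypothetical heavy-atom token list for a small molecule.
example_atoms = ["C", "C", "O", "N", "C", "C", "C", "O"]
masked_atoms, targets = mask_atoms(example_atoms)
```

Because the labels come from the molecule itself, this kind of objective needs no experimental annotations, which is what lets pretraining use large unlabeled compound sets.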


Subjects
Models, Theoretical , Neural Networks, Computer
15.
Nucleic Acids Res ; 49(W1): W5-W14, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33893803

ABSTRACT

Because undesirable pharmacokinetics and toxicity of candidate compounds are the main reasons for the failure of drug development, it has been widely recognized that absorption, distribution, metabolism, excretion and toxicity (ADMET) should be evaluated as early as possible. In silico ADMET evaluation models have been developed as an additional tool to assist medicinal chemists in the design and optimization of leads. Here, we announce the release of ADMETlab 2.0, a completely redesigned version of the widely used ADMETlab web server for predicting the pharmacokinetic and toxicity properties of chemicals. It supports approximately twice as many ADMET-related endpoints as the previous version, including 17 physicochemical properties, 13 medicinal chemistry properties, 23 ADME properties, 27 toxicity endpoints and 8 toxicophore rules (751 substructures). A multi-task graph attention framework was employed to develop the robust and accurate models in ADMETlab 2.0. A batch computation module was provided in response to numerous requests from users, and the presentation of the results was further optimized. The ADMETlab 2.0 server is freely available, without registration, at https://admetmesh.scbdd.com/.


Subjects
Pharmacokinetics , Software , Drug-Related Side Effects and Adverse Reactions , Internet , Pharmaceutical Preparations/chemistry , Phthalazines/chemistry , Phthalazines/pharmacokinetics , Phthalazines/toxicity , Piperazines/chemistry , Piperazines/pharmacokinetics , Piperazines/toxicity
16.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33709154

ABSTRACT

BACKGROUND: Substructure screening is widely applied to evaluate the molecular potency and ADMET properties of compounds in drug discovery pipelines, and it can also be used to interpret QSAR models for the design of new compounds with desirable physicochemical and biological properties. With the continuous accumulation of more experimental data, data-driven computational systems which can derive representative substructures from large chemical libraries attract more attention. Therefore, the development of an integrated and convenient tool to generate and implement representative substructures is urgently needed. RESULTS: In this study, PySmash, a user-friendly and powerful tool to generate different types of representative substructures, was developed. The current version of PySmash provides both a Python package and an individual executable program, which achieves ease of operation and pipeline integration. Three types of substructure generation algorithms, including circular, path-based and functional group-based algorithms, are provided. Users can conveniently customize their own requirements for substructure size, accuracy and coverage, statistical significance and parallel computation during execution. Besides, PySmash provides the function for external data screening. CONCLUSION: PySmash, a user-friendly and integrated tool for the automatic generation and implementation of representative substructures, is presented. Three screening examples, including toxicophore derivation, privileged motif detection and the integration of substructures with machine learning (ML) models, are provided to illustrate the utility of PySmash in safety profile evaluation, therapeutic activity exploration and molecular optimization, respectively. Its executable program and Python package are available at https://github.com/kotori-y/pySmash.


Subjects
Computational Biology/methods , Drug Discovery/methods , Machine Learning , Software , Carcinogenicity Tests/methods , Carcinogens , Drug Screening Assays, Antitumor/methods , Humans
17.
Drug Discov Today ; 26(6): 1353-1358, 2021 06.
Article in English | MEDLINE | ID: mdl-33581116

ABSTRACT

In 2010, the pan-assay interference compounds (PAINS) rule was proposed to identify false-positive compounds, especially frequent hitters (FHs), in biological screening campaigns, and has rapidly become an essential component in drug design. However, the specific mechanisms remain unknown, and the result validation and follow-up processing schemes are still unclear. In this review, a large benchmark collection of >600,000 compounds sourced from databases and the literature, including six common false-positive mechanisms, was used to evaluate the detection ability of PAINS. In addition, 400 million purchasable molecules from the ZINC database were also applied to PAINS screening. The results indicate that the PAINS rule is not suitable for the screening of all types of false-positive results and needs more improvement.


Subjects
Databases, Factual , Drug Design , High-Throughput Screening Assays/methods , Benchmarking , Drug Discovery/methods , Humans
18.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33418563

ABSTRACT

Matched molecular pairs analysis (MMPA) has become a powerful tool for automatically and systematically identifying medicinal chemistry transformations from compound/property datasets. However, accurate determination of matched molecular pair (MMP) transformations largely depends on the size and quality of existing experimental data. Lack of high-quality experimental data heavily hampers the extraction of more effective medicinal chemistry knowledge. Here, we developed a new strategy called quantitative structure-activity relationship (QSAR)-assisted-MMPA to expand the number of chemical transformations and took the logD7.4 property endpoint as an example to demonstrate the reliability of the new method. A reliable logD7.4 consensus prediction model was first established, and its applicability domain was strictly assessed. By applying the reliable logD7.4 prediction model to screen two chemical databases, we obtained more high-quality logD7.4 data by defining a strict applicability domain threshold. Then, MMPA was performed on the predicted data and experimental data to derive more chemical rules. To validate the reliability of the chemical rules, we compared the magnitude and directionality of the property changes of the predicted rules with those of the measured rules. Then, we compared the novel chemical rules generated by our proposed approach with the published chemical rules, and found that the magnitude and directionality of the property changes were consistent, indicating that the proposed QSAR-assisted-MMPA approach has the potential to enrich the collection of rule types or even identify completely novel rules. Finally, we found that the number of MMP rules derived from the experimental data could be amplified by the predicted data, which helps in analyzing medicinal chemistry rules in a local chemical environment. In summary, the proposed QSAR-assisted-MMPA approach could be regarded as a very promising strategy to expand the chemical transformation space for lead optimization, especially when there are not enough experimental data to support MMPA.


Subjects
Chemistry Techniques, Synthetic/methods , Chemistry, Pharmaceutical/methods , Drug Discovery/methods , Drugs, Investigational/chemical synthesis , Models, Statistical , Biotransformation , Databases, Chemical , Datasets as Topic , Drug Discovery/statistics & numerical data , Drugs, Investigational/metabolism , Humans , Molecular Structure , Quantitative Structure-Activity Relationship , Reproducibility of Results
19.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32892221

ABSTRACT

BACKGROUND: High-throughput screening (HTS) and virtual screening (VS) have been widely used to identify potential hits from large chemical libraries. However, the frequent occurrence of 'noisy compounds' in the screened libraries, such as compounds with poor drug-likeness, poor selectivity or potential toxicity, has greatly weakened the enrichment capability of HTS and VS campaigns. Therefore, the development of comprehensive and credible tools to detect noisy compounds in chemical libraries is urgently needed in the early stages of drug discovery. RESULTS: In this study, we developed a freely available integrated Python library for negative design, called Scopy, which supports data preparation, calculation of descriptors, scaffolds and screening filters, and data visualization. The current version of Scopy can calculate 39 basic molecular properties, 3 comprehensive molecular evaluation scores, 2 types of molecular scaffolds, 6 types of substructure descriptors and 2 types of fingerprints. A number of important screening rules are also provided by Scopy, including 15 drug-likeness rules (13 drug-likeness rules and 2 building block rules), 8 frequent hitter rules (four assay interference substructure filters and four promiscuous compound substructure filters), and 11 toxicophore filters (five human-related toxicity substructure filters, three environment-related toxicity substructure filters and three comprehensive toxicity substructure filters). Moreover, this library supports four different visualization functions to help users gain a better understanding of the screened data, including basic feature radar chart, feature-feature-related scatter diagram, functional group marker gram and cloud gram. CONCLUSION: Scopy provides a comprehensive Python package to filter out compounds with undesirable properties or substructures, which will benefit the design of high-quality chemical libraries for drug design and discovery.
It is freely available at https://github.com/kotori-y/Scopy.
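
To make the idea of a drug-likeness rule filter concrete, the following minimal sketch applies Lipinski's rule of five, chosen here only as a widely known example of such a rule. It does not use Scopy's API, and the compound names, property values, and the `max_violations` default are assumptions for illustration; in practice the properties would be calculated from molecular structures.

```python
# Lipinski's rule of five: upper limits on four precomputed properties.
LIPINSKI_LIMITS = {
    "mw": 500.0,   # molecular weight (Da)
    "logp": 5.0,   # octanol-water partition coefficient
    "hbd": 5,      # hydrogen-bond donors
    "hba": 10,     # hydrogen-bond acceptors
}


def count_violations(props, limits=LIPINSKI_LIMITS):
    """Count how many property limits a compound exceeds."""
    return sum(1 for key, limit in limits.items() if props[key] > limit)


def passes_rule(props, max_violations=1):
    """Accept a compound that exceeds at most `max_violations` limits (assumed default)."""
    return count_violations(props) <= max_violations


# Hypothetical library with made-up property values.
library = {
    "cmpd_a": {"mw": 320.4, "logp": 2.1, "hbd": 2, "hba": 5},
    "cmpd_b": {"mw": 812.9, "logp": 6.3, "hbd": 7, "hba": 14},
}

kept = [name for name, props in library.items() if passes_rule(props)]
```

Here `cmpd_b` exceeds all four limits and is filtered out, while `cmpd_a` is kept.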


Subjects
Databases, Pharmaceutical/statistics & numerical data , Drug Design , Drug Development/methods , High-Throughput Screening Assays/methods , Small Molecule Libraries , Biological Products/chemistry , Computational Biology/methods , Drug Discovery/methods , Drug Stability , Humans , Molecular Structure , Pharmaceutical Preparations/chemistry , Reproducibility of Results , Research Design
20.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33201188

ABSTRACT

BACKGROUND: Fluorescent detection methods are indispensable tools for chemical biology. However, the frequent appearance of potential fluorescent compounds has greatly interfered with the recognition of compounds with genuine activity. Such fluorescence interference is especially difficult to identify as it is reproducible and concentration-dependent. Therefore, the development of a credible screening tool to detect fluorescent compounds in chemical libraries is urgently needed in the early stages of drug discovery. RESULTS: In this study, we developed a webserver, ChemFLuo, for fluorescent compound detection, based on two large and high-quality training datasets containing 4906 blue and 8632 green fluorescent compounds. These molecules were used to construct a group of prediction models based on the combination of three machine learning algorithms and seven types of molecular representations. The best blue fluorescence prediction model achieved a balanced accuracy (BA) of 0.858 and an area under the receiver operating characteristic curve (AUC) of 0.931 for the validation set, and BA = 0.823 and AUC = 0.903 for the test set. The best green fluorescence prediction model achieved BA = 0.810 and AUC = 0.887 for the validation set, and BA = 0.771 and AUC = 0.852 for the test set. Besides the prediction models, 22 blue and 16 green representative fluorescent substructures were summarized for the screening of potential fluorescent compounds. The comparison with other fluorescence detection tools and the application to external validation sets and large molecule libraries have demonstrated the reliability of the prediction models for fluorescent compound detection. CONCLUSION: ChemFLuo is a public webserver to filter out compounds with undesirable fluorescent properties, which will benefit the design of high-quality chemical libraries for drug discovery. It is freely available at http://admet.scbdd.com/chemfluo/index/.
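
For readers unfamiliar with the balanced accuracy (BA) metric reported above, it is the mean of sensitivity and specificity. The sketch below computes it for binary labels; the function name and the toy labels are hypothetical and are not taken from ChemFLuo.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = fluorescent)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2


# Hypothetical toy labels: 4 fluorescent and 6 non-fluorescent compounds.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

ba = balanced_accuracy(y_true, y_pred)  # sensitivity 3/4, specificity 5/6
```

Unlike plain accuracy, this metric weights both classes equally, which matters when fluorescent and non-fluorescent compounds are unevenly represented.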


Subjects
Drug Discovery , Fluorescent Dyes/chemistry , Machine Learning , Models, Chemical , Small Molecule Libraries , Fluorescence