Results 1 - 20 of 96
1.
J Chem Inf Model ; 64(8): 3222-3236, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38498003

ABSTRACT

Liver microsomal stability, a crucial aspect of metabolic stability, significantly impacts practical drug discovery. However, current models for predicting liver microsomal stability are based on limited molecular information from a single species. To address this limitation, we constructed the largest public database of compounds from three common species: human, rat, and mouse. Subsequently, we developed a series of classification models using both traditional descriptor-based and classic graph-based machine learning (ML) algorithms. Remarkably, the best-performing models for the three species achieved Matthews correlation coefficients (MCCs) of 0.616, 0.603, and 0.574, respectively, on the test set. Furthermore, through the construction of consensus models based on these individual models, we have demonstrated their superior predictive performance in comparison with the existing models of the same type. To explore the similarities and differences in the properties of liver microsomal stability among multispecies molecules, we conducted preliminary interpretative explorations using the Shapley additive explanations (SHAP) and atom heatmap approaches for the models and misclassified molecules. Additionally, we further investigated representative structural modifications and substructures that decrease the liver microsomal stability in different species using the matched molecule pair analysis (MMPA) method and substructure extraction techniques. The established prediction models, along with insightful interpretation information regarding liver microsomal stability, will significantly contribute to enhancing the efficiency of exploring practical drugs for development.


Subjects
Artificial Intelligence; Microsomes, Liver; Microsomes, Liver/metabolism; Animals; Mice; Rats; Humans; Machine Learning; Drug Discovery/methods; Pharmaceutical Preparations/metabolism; Pharmaceutical Preparations/chemistry
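
The abstract above reports model quality as Matthews correlation coefficients (MCCs). As a reading aid, here is a minimal sketch of how an MCC is computed from binary confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the study; returning 0.0 for an undefined denominator is a common convention, assumed here.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a stable/unstable classifier (illustrative only).
print(round(matthews_corrcoef(tp=80, tn=75, fp=25, fn=20), 3))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which is why it is often preferred over plain accuracy on imbalanced classification sets.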
2.
Comput Methods Programs Biomed ; 248: 108137, 2024 May.
Article in English | MEDLINE | ID: mdl-38520784

ABSTRACT

BACKGROUND AND OBJECTIVE: Clinical pharmacological modeling and statistical analysis software is an essential basic tool for drug development and personalized drug therapy. The learning curve of current basic tools is steep and unfriendly to beginners. The curve is even more challenging in cases of significant individual differences or measurement errors in data, resulting in difficulties in accurately estimating pharmacokinetic parameters by existing fitting algorithms. Hence, this study aims to explore a new optimized parameter fitting algorithm that reduces the sensitivity of the model to initial values and integrate it into the CPhaMAS platform, a user-friendly online application for pharmacokinetic data analysis. METHODS: In this study, we proposed an optimized Nelder-Mead method that reinitializes simplex vertices when trapped in local solutions and integrated it into the CPhaMAS platform. CPhaMAS, an online platform for pharmacokinetic data analysis, includes three modules: compartment model analysis, non-compartment analysis (NCA) and bioequivalence/bioavailability (BE/BA) analysis. The proposed CPhaMAS platform was evaluated and compared with the existing WinNonlin. RESULTS: The platform was easy to learn and did not require code programming. The accuracy investigation found that the optimized Nelder-Mead method of the CPhaMAS platform showed better accuracy (smaller mean relative error and higher R²) than WinNonlin in two-compartment and extravascular administration models when the initial value was set to true and abnormal values (10 times larger or smaller than the true value). The mean relative error of the NCA calculation parameters of CPhaMAS and WinNonlin was <0.0001%. When calculating BE for conventional, high-variability and narrow-therapeutic drugs, the main statistical parameters of Cmax, AUCt, and AUCinf in CPhaMAS had a mean relative error of <0.01% compared with WinNonlin. CONCLUSIONS: In summary, CPhaMAS is a user-friendly platform with relatively accurate algorithms. It is a powerful tool for analysing pharmacokinetic data for new drug development and precision medicine.


Subjects
Algorithms; Software; Models, Theoretical; Pharmaceutical Preparations; Research Design
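
The abstract above compares tools by the mean relative error of NCA parameters such as Cmax and AUCt. The sketch below shows one common way these quantities are obtained: the linear trapezoidal rule for AUC up to the last sampling time, and a percentage mean relative error between two sets of results. It is not the CPhaMAS or WinNonlin implementation; the function names and the concentration-time profile are hypothetical.

```python
def auc_linear_trapezoid(times, concs):
    """AUC from the first to the last sampling time (linear trapezoidal rule)."""
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need two or more paired time/concentration points")
    return sum(
        (t1 - t0) * (c0 + c1) / 2.0
        for t0, t1, c0, c1 in zip(times, times[1:], concs, concs[1:])
    )


def mean_relative_error(reference, candidate):
    """Mean of |candidate - reference| / |reference|, as a percentage."""
    errors = [abs(c - r) / abs(r) for r, c in zip(reference, candidate)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical concentration-time profile (hours, mg/L); not data from the study.
times = [0.0, 1.0, 2.0, 4.0, 8.0]
concs = [0.0, 4.0, 3.0, 2.0, 1.0]

cmax = max(concs)                           # peak observed concentration
auc_t = auc_linear_trapezoid(times, concs)  # AUC up to the last time point
print(cmax, auc_t)
```

Applying `mean_relative_error` to the parameter values produced by two tools for the same data gives the kind of agreement figure quoted in the abstract.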
3.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38385872

ABSTRACT

Drug discovery and development constitute a laborious and costly undertaking. The success of a drug hinges not only on good efficacy but also on acceptable absorption, distribution, metabolism, elimination, and toxicity (ADMET) properties. Overall, up to 50% of drug development failures have been attributed to undesirable ADMET profiles. As a multi-parameter objective, the optimization of the ADMET properties is extremely challenging owing to the vast chemical space and limited human expert knowledge. In this study, a freely available platform called Chemical Molecular Optimization, Representation and Translation (ChemMORT) is developed for the optimization of multiple ADMET endpoints without the loss of potency (https://cadd.nscc-tj.cn/deploy/chemmort/). ChemMORT contains three modules: Simplified Molecular Input Line Entry System (SMILES) Encoder, Descriptor Decoder and Molecular Optimizer. The SMILES Encoder can generate the molecular representation with a 512-dimensional vector, and the Descriptor Decoder is able to translate the above representation to the corresponding molecular structure with high accuracy. Based on reversible molecular representation and particle swarm optimization strategy, the Molecular Optimizer can be used to effectively optimize undesirable ADMET properties without the loss of bioactivity, which essentially accomplishes the design of inverse QSAR. The constrained multi-objective optimization of the poly(ADP-ribose) polymerase-1 inhibitor is provided as the case to explore the utility of ChemMORT.


Subjects
Deep Learning; Humans; Drug Development; Drug Discovery; Poly(ADP-ribose) Polymerase Inhibitors
4.
J Med Chem ; 67(2): 1347-1359, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38181431

ABSTRACT

Patents play a crucial role in drug research and development, providing early access to unpublished data and offering unique insights. Identifying key compounds in patents is essential to finding novel lead compounds. This study collected a comprehensive data set comprising 1555 patents, encompassing 1000 key compounds, to explore innovative approaches for predicting these key compounds. Our novel PatentNetML framework integrated network science and machine learning algorithms, combining network measures, ADMET properties, and physicochemical properties, to construct robust classification models to identify key compounds. Through a model interpretation and an analysis of three compelling case studies, we showcase the potential of PatentNetML in unveiling hidden patterns and connections within diverse patents. While our framework is pioneering, we acknowledge its limitations when applied to patents that deviate from the assumed central pattern. This work serves as a promising foundation for future research endeavors aimed at efficiently identifying promising drug candidates and expediting drug discovery in the pharmaceutical industry.


Subjects
Algorithms; Machine Learning; Drug Discovery; Drug Industry
5.
Bioinformatics ; 40(1), 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38243703

ABSTRACT

MOTIVATION: Spatial clustering is essential and challenging for spatial transcriptomics' data analysis to unravel tissue microenvironment and biological function. Graph neural networks are promising to address gene expression profiles and spatial location information in spatial transcriptomics to generate latent representations. However, choosing an appropriate graph deep learning module and graph neural network necessitates further exploration and investigation. RESULTS: In this article, we present GRAPHDeep to assemble a spatial clustering framework for heterogeneous spatial transcriptomics data. Through integrating 2 graph deep learning modules and 20 graph neural networks, the most appropriate combination is decided for each dataset. The constructed spatial clustering method is compared with state-of-the-art algorithms to demonstrate its effectiveness and superiority. The significant new findings include: (i) the number of genes or proteins of spatial omics data is quite crucial in spatial clustering algorithms; (ii) the variational graph autoencoder is more suitable for spatial clustering tasks than deep graph infomax module; (iii) UniMP, SAGE, SuperGAT, GATv2, GCN, and TAG are the recommended graph neural networks for spatial clustering tasks; and (iv) the used graph neural network in the existent spatial clustering frameworks is not the best candidate. This study could be regarded as desirable guidance for choosing an appropriate graph neural network for spatial clustering. AVAILABILITY AND IMPLEMENTATION: The source code of GRAPHDeep is available at https://github.com/narutoten520/GRAPHDeep. The studied spatial omics data are available at https://zenodo.org/record/8141084.


Subjects
Algorithms; Gene Expression Profiling; Neural Networks, Computer; Software; Cluster Analysis
6.
J Chem Inf Model ; 64(1): 96-109, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38132638

ABSTRACT

Detecting drug-drug interactions (DDIs) is an essential step in drug development and drug administration. Given the shortcomings of current experimental methods, the machine learning (ML) approach has become a reliable alternative, attracting extensive attention from the academic and industrial fields. With the rapid development of computational science and the growing popularity of cross-disciplinary research, a large number of DDI prediction studies based on ML methods have been published in recent years. To give an insight into the current situation and future direction of DDI prediction research, we systematically review these studies from three aspects: (1) the classic DDI databases, mainly including databases of drugs, side effects, and DDI information; (2) commonly used drug attributes, which focus on chemical, biological, and phenotypic attributes for representing drugs; (3) popular ML approaches, such as shallow learning-based, deep learning-based, recommender system-based, and knowledge graph-based methods for DDI detection. For each section, related studies are described, summarized, and compared. In the end, we summarize the research status of DDI prediction based on ML methods and point out the existing issues, future challenges, potential opportunities, and subsequent research direction.


Subjects
Knowledge Bases; Machine Learning; Drug Interactions; Pharmaceutical Preparations; Databases, Factual
7.
Brief Bioinform ; 24(4), 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37401373

ABSTRACT

Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). DDIs refer to a change in the effect of one drug due to the presence of another drug in the human body, which plays an essential role in drug discovery and clinical research. DDI prediction through traditional clinical trials and experiments is an expensive and time-consuming process. To correctly apply advanced AI and deep learning, developers and users face various challenges such as the availability and encoding of data resources, and the design of computational methods. This review summarizes chemical structure based, network based, natural language processing based and hybrid methods, providing an updated and accessible guide to the broad research and development community with different domain knowledge. We introduce widely used molecular representations and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDI prediction.


Subjects
Artificial Intelligence; Neural Networks, Computer; Humans; Drug Interactions; Natural Language Processing; Drug Discovery
8.
Brief Bioinform ; 24(4), 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37344167

ABSTRACT

Adverse drug events (ADEs) are common in clinical practice and can cause significant harm to patients and increase resource use. Natural language processing (NLP) has been applied to automate ADE detection, but NLP systems become less adaptable when drug entities are missing or multiple medications are specified in clinical narratives. Additionally, no Chinese-language NLP system has been developed for ADE detection due to the complexity of Chinese semantics, despite >10 million cases of drug-related adverse events occurring annually in China. To address these challenges, we propose DKADE, a deep learning and knowledge graph-based framework for identifying ADEs. DKADE infers missing drug entities and evaluates their correlations with ADEs by combining medication orders and existing drug knowledge. Moreover, DKADE can automatically screen for new adverse drug reactions. Experimental results show that DKADE achieves an overall F1-score value of 91.13%. Furthermore, the adaptability of DKADE is validated using real-world external clinical data. In summary, DKADE is a powerful tool for studying drug safety and automating adverse event monitoring.


Subjects
Deep Learning; Drug-Related Side Effects and Adverse Reactions; Humans; Pattern Recognition, Automated; Semantics; Natural Language Processing
9.
J Cheminform ; 15(1): 48, 2023 Apr 23.
Article in English | MEDLINE | ID: mdl-37088813

ABSTRACT

Identification and validation of bioactive small-molecule targets is a significant challenge in drug discovery. In recent years, various in-silico approaches have been proposed to expedite time- and resource-consuming experiments for target detection. Herein, we developed several chemogenomic models for target prediction based on multi-scale information of chemical structures and protein sequences. By combining the information of a compound with multiple protein targets together and putting these compound-target pairs into a well-established model, the scores to indicate whether there are interactions between compounds and targets can be derived, and thus a target prediction task can be completed by sorting the outputted scores. To improve the prediction performance, we constructed several chemogenomic models using multi-scale information of chemical structures and protein sequences, and the ensemble model with the best performance was used as our final model. The model was validated by various strategies and external datasets and the promising target prediction capability of the model, i.e., the fraction of known targets identified in the top-k (1 to 10) list of the potential target candidates suggested by the model, was confirmed. Compared with multiple state-of-the-art target prediction methods, our model showed equivalent or better predictive ability in terms of the top-k predictions. It is expected that our method can be utilized as a powerful computational tool to narrow down the potential targets for experimental testing.
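
The abstract above ranks candidate targets by sorting predicted interaction scores and evaluates the fraction of known targets recovered in the top-k list. A minimal sketch of that evaluation follows. The data layout (a dictionary of per-compound target scores), the function name, and the example values are assumptions made for illustration, not the authors' code or data.

```python
def top_k_recovery(scores_by_compound, known_by_compound, k):
    """Fraction of known targets that appear in each compound's top-k list."""
    found = 0
    total = 0
    for compound, scores in scores_by_compound.items():
        # Sort candidate targets by predicted score, highest first, and keep k.
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        known = known_by_compound.get(compound, set())
        found += len(known & set(ranked))
        total += len(known)
    return found / total if total else 0.0


# Hypothetical scores and annotations (illustrative only).
scores = {
    "cpd1": {"T1": 0.9, "T2": 0.4, "T3": 0.7},
    "cpd2": {"T1": 0.2, "T2": 0.8, "T3": 0.5},
}
known = {"cpd1": {"T3"}, "cpd2": {"T1", "T2"}}

for k in (1, 2, 3):
    print(k, top_k_recovery(scores, known, k))
```

Sweeping k from 1 to 10, as in the abstract, shows how quickly the known targets are recovered as the candidate list grows.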

10.
Brief Bioinform ; 24(3), 2023 May 19.
Article in English | MEDLINE | ID: mdl-37080761

ABSTRACT

Advancing spatially resolved transcriptomics (ST) technologies help biologists comprehensively understand organ function and tissue microenvironment. Accurate spatial domain identification is the foundation for delineating genome heterogeneity and cellular interaction. Motivated by this perspective, a graph deep learning (GDL) based spatial clustering approach is constructed in this paper. First, the deep graph infomax module embedded with residual gated graph convolutional neural network is leveraged to address the gene expression profiles and spatial positions in ST. Then, the Bayesian Gaussian mixture model is applied to handle the latent embeddings to generate spatial domains. Designed experiments certify that the presented method is superior to other state-of-the-art GDL-enabled techniques on multiple ST datasets. The codes and dataset used in this manuscript are summarized at https://github.com/narutoten520/SCGDL.


Subjects
Deep Learning; Transcriptome; Bayes Theorem; Gene Expression Profiling; Cell Communication
11.
J Chem Inf Model ; 63(8): 2345-2359, 2023 Apr 24.
Article in English | MEDLINE | ID: mdl-37000044

ABSTRACT

The n-octanol/buffer solution distribution coefficient at pH = 7.4 (log D7.4) is an indicator of lipophilicity, and it influences a wide variety of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties and druggability of compounds. In log D7.4 prediction, graph neural networks (GNNs) can uncover subtle structure-property relationships (SPRs) by automatically extracting features from molecular graphs that facilitate the learning of SPRs, but their performances are often limited by the small size of available datasets. Herein, we present a transfer learning strategy called pretraining on computational data and then fine-tuning on experimental data (PCFE) to fully exploit the predictive potential of GNNs. PCFE works by pretraining a GNN model on 1.71 million computational log D data (low-fidelity data) and then fine-tuning it on 19,155 experimental log D7.4 data (high-fidelity data). The experiments for three GNN architectures (graph convolutional network (GCN), graph attention network (GAT), and Attentive FP) demonstrated the effectiveness of PCFE in improving GNNs for log D7.4 predictions. Moreover, the optimal PCFE-trained GNN model (cx-Attentive FP, test-set R² = 0.909) outperformed four excellent descriptor-based models (random forest (RF), gradient boosting (GB), support vector machine (SVM), and extreme gradient boosting (XGBoost)). The robustness of the cx-Attentive FP model was also confirmed by evaluating the models with different training data sizes and dataset splitting strategies. Therefore, we developed a webserver and defined the applicability domain for this model. The webserver (http://tools.scbdd.com/chemlogd/) provides free log D7.4 prediction services. In addition, the important descriptors for log D7.4 were detected by the Shapley additive explanations (SHAP) method, and the most relevant substructures of log D7.4 were identified by the attention mechanism. Finally, the matched molecular pair analysis (MMPA) was performed to summarize the contributions of common chemical substituents to log D7.4, including a variety of hydrocarbon groups, halogen groups, heteroatoms, and polar groups. In conclusion, we believe that the cx-Attentive FP model can serve as a reliable tool to predict log D7.4 and hope that pretraining on low-fidelity data can help GNNs make accurate predictions of other endpoints in drug discovery.


Subjects
Drug Discovery; Halogens; 1-Octanol; Learning; Neural Networks, Computer
12.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36642412

ABSTRACT

Machine learning-based scoring functions (MLSFs) have become a very favorable alternative to classical scoring functions because of their potential superior screening performance. However, the information of negative data used to construct MLSFs was rarely reported in the literature, and meanwhile the putative inactive molecules recorded in existing databases usually have obvious bias from active molecules. Here we proposed an easy-to-use method named AMLSF that combines active learning using negative molecular selection strategies with MLSF, which can iteratively improve the quality of inactive sets and thus reduce the false positive rate of virtual screening. We chose energy auxiliary terms learning as the MLSF and validated our method on eight targets in the diverse subset of DUD-E. For each target, we screened the IterBioScreen database by AMLSF and compared the screening results with those of the four control models. The results illustrate that the number of active molecules in the top 1000 molecules identified by AMLSF was significantly higher than those identified by the control models. In addition, the free energy calculation results for the top 10 molecules screened out by the AMLSF, null model and control models based on DUD-E also proved that more active molecules can be identified, and the false positive rate can be reduced by AMLSF.


Subjects
Proteins; Proteins/metabolism; Databases, Factual; Ligands; Molecular Docking Simulation; Protein Binding
13.
Brief Bioinform ; 24(2), 2023 Mar 19.
Article in English | MEDLINE | ID: mdl-36681902

ABSTRACT

Identification of potential targets for known bioactive compounds and novel synthetic analogs is of considerable significance. In silico target fishing (TF) has become an alternative strategy because of the expensive and laborious wet-lab experiments, explosive growth of bioactivity data and rapid development of high-throughput technologies. However, these TF methods are based on different algorithms, molecular representations and training datasets, which may lead to different results when predicting the same query molecules. This can be confusing for practitioners in practical applications. Therefore, this study systematically evaluated nine popular ligand-based TF methods based on target and ligand-target pair statistical strategies, which will help practitioners make choices among multiple TF methods. The evaluation results showed that SwissTargetPrediction was the best method to produce the most reliable predictions while enriching more targets. High-recall similarity ensemble approach (SEA) was able to find real targets for more compounds compared with other TF methods. Therefore, SwissTargetPrediction and SEA can be considered as primary selection methods in future studies. In addition, the results showed that k = 5 was the optimal number of experimental candidate targets. Finally, a novel ensemble TF method based on consensus voting is proposed to improve the prediction performance. The precision of the ensemble TF method outperforms the individual TF method, indicating that the ensemble TF method can more effectively identify real targets within a given top-k threshold. The results of this study can be used as a reference to guide practitioners in selecting the most effective methods in computational drug discovery.


Subjects
Algorithms; Ligands
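
The abstract above proposes an ensemble target-fishing method based on consensus voting within a top-k threshold. The sketch below illustrates one plausible reading of that idea: each method votes for the targets in its own top-k list, and targets are ranked by vote count. The voting rule, the function name, the method names, and the tie-break by target name are all assumptions; the abstract does not specify them.

```python
from collections import Counter


def consensus_targets(predictions_by_method, k=5):
    """Rank targets by how many methods place them in their own top-k list."""
    votes = Counter()
    for ranked in predictions_by_method.values():
        # Each method contributes at most one vote per target.
        votes.update(set(ranked[:k]))
    # Sort by vote count (descending), then by name for a deterministic order.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ranked predictions from three methods (illustrative only).
methods = {
    "method_a": ["T1", "T2", "T3"],
    "method_b": ["T2", "T4", "T1"],
    "method_c": ["T2", "T5", "T6"],
}
print(consensus_targets(methods, k=2))
```

The default `k=5` mirrors the abstract's finding that k = 5 was the optimal number of candidate targets; the example uses `k=2` only to keep the output short.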
14.
J Chem Inf Model ; 63(1): 111-125, 2023 Jan 09.
Article in English | MEDLINE | ID: mdl-36472475

ABSTRACT

Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on a combination of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis illustrate that the best prediction model based on Attentive FP yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) value of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value of 67.5% for the external validation set. Additionally, the Shapley additive explanation (SHAP) and atom heatmap approaches were utilized to discover the important features and structural fragments related to hematotoxicity, which could offer helpful tips to detect undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to further characterize and investigate the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the novel graph-based deep learning algorithms and insightful interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.


Subjects
Deep Learning; Computer Simulation; Machine Learning; Algorithms; Drug Discovery
15.
J Cheminform ; 14(1): 89, 2022 Dec 31.
Article in English | MEDLINE | ID: mdl-36587232

ABSTRACT

Traditional Chinese Medicine (TCM) has been widely used in the treatment of various diseases for millennia. In the modernization process of TCM, TCM ingredient databases are playing more and more important roles. However, most of the existing TCM ingredient databases do not provide a simplification function for extracting key ingredients in each herb or formula, which hinders the research on the mechanisms of action of the ingredients in TCM databases. The lack of quality control and standardization of the data in most of these existing databases is also a prominent disadvantage. Therefore, we developed a Traditional Chinese Medicine Simplified Integrated Database (TCMSID) with high storage, high quality and standardization. The database includes 499 herbs registered in the Chinese pharmacopeia with 20,015 ingredients, 3270 targets as well as corresponding detailed information. TCMSID is not only a database of herbal ingredients, but also a TCM simplification platform. Key ingredients from TCM herbs are available to be screened out and regarded as representatives to explore the mechanism of TCM herbs by implementing multi-tool target prediction and multilevel network construction. TCMSID provides abundant data sources and analysis platforms for TCM simplification and drug discovery, which is expected to promote modernization and internationalization of TCM and enhance its international status in the future. TCMSID is freely available at https://tcm.scbdd.com.

16.
J Cheminform ; 14(1): 23, 2022 Apr 15.
Article in English | MEDLINE | ID: mdl-35428354

ABSTRACT

Drug-drug interaction (DDI) often causes serious adverse reactions and thus results in inestimable economic and social loss. Currently, comprehensive DDI evaluation has become a major challenge in pharmaceutical research due to the time-consuming and costly process of experimental assessment, and it is necessary to develop effective in silico methods to predict and evaluate DDIs accurately and efficiently. In this study, based on a large number of substrates and inhibitors related to five important CYP450 isozymes (CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4), a series of high-performance predictive models for metabolic DDIs were constructed by two machine learning methods (random forest and XGBoost) and 4 different types of descriptors (MOE_2D, CATS, ECFP4 and MACCS). To reduce the uncertainty of individual models, the consensus method was applied to yield more reliable predictions. A series of evaluations illustrated that the consensus models were more reliable and robust for the DDI predictions of new drug combinations. For the internal validation, the overall prediction accuracy and AUC value of the DDI models were around 0.8 and 0.9, respectively. When it was applied to the external datasets, the model accuracy was 0.793 and 0.795 for multi-level validation and external validation, respectively. Furthermore, we also compared our model with some recently published tools and then applied the final model to predict FDA-approved drugs and proposed 54,013 possible drug pairs with potential DDIs. In summary, we developed a powerful DDI predictive model from the perspective of the CYP450 enzyme family, and it will support future drug development and clinical pharmacy research.

17.
Brief Bioinform ; 23(2), 2022 Mar 10.
Article in English | MEDLINE | ID: mdl-35212357

ABSTRACT

Structural information for chemical compounds is often described by pictorial images in most scientific documents, which cannot be easily understood and manipulated by computers. This dilemma makes optical chemical structure recognition (OCSR) an essential tool for automatically mining knowledge from an enormous amount of literature. However, existing OCSR methods fall far short of our expectations for realistic requirements due to their poor recovery accuracy. In this paper, we developed a deep neural network model named ABC-Net (Atom and Bond Center Network) to predict graph structures directly. Based on the divide-and-conquer principle, we propose to model an atom or a bond as a single point in the center. In this way, we can leverage a fully convolutional neural network (CNN) to generate a series of heat-maps to identify these points and predict relevant properties, such as atom types, atom charges, bond types and other properties. Thus, the molecular structure can be recovered by assembling the detected atoms and bonds. Our approach integrates all the detection and property prediction tasks into a single fully CNN, which is scalable and capable of processing molecular images quite efficiently. Experimental results demonstrate that our method could achieve a significant improvement in recognition performance compared with publicly available tools. The proposed method could be considered as a promising solution to OCSR problems and a starting point for the acquisition of molecular information in the literature.


Subjects
Deep Learning; Molecular Structure; Neural Networks, Computer
18.
Food Chem ; 372: 131249, 2022 Mar 15.
Article in English | MEDLINE | ID: mdl-34634587

ABSTRACT

Nowadays, computational approaches have drawn more and more attention when exploring the relationship between sweetness and chemical structure instead of traditional experimental tests. In this work, we proposed a novel multi-layer sweetness evaluation system based on machine learning methods. It can be used to evaluate sweet properties of compounds with different chemical spaces and categories, including natural, artificial, carbohydrate, non-carbohydrate, nutritive and non-nutritive ones, suitable for different application scenarios. Furthermore, it provided quantitative predictions of sweetness. In addition, sweetness-related chemical basis and structure transforming rules were obtained by using molecular cloud and matched molecular pair analysis (MMPA) methods. This work systematically improved the data quality, explored the best machine learning algorithm and molecular characterizing strategy, and finally obtained robust models to establish a multi-layer prediction system (available at: https://github.com/ifyoungnet/ChemSweet). We hope that this study could facilitate food scientists with efficient screening and precise development of high-quality sweeteners.


Subjects
Sweetening Agents; Taste; Machine Learning
19.
Brief Bioinform ; 23(1), 2022 Jan 17.
Article in English | MEDLINE | ID: mdl-34849567

ABSTRACT

MOTIVATION: Understanding chemical-gene interactions (CGIs) is crucial for screening drugs. Wet experiments are usually costly and laborious, which limits relevant studies to a small scale. On the contrary, computational studies enable efficient in-silico exploration. For the CGI prediction problem, a common method is to perform systematic analyses on a heterogeneous network involving various biomedical entities. Recently, graph neural networks have become popular in the field of relation prediction. However, the inherent heterogeneous complexity of biological interaction networks and the massive amount of data pose enormous challenges. This paper aims to develop a data-driven model that is capable of learning latent information from the interaction network and making correct predictions. RESULTS: We developed BioNet, a deep biological network model with a graph encoder-decoder architecture. The graph encoder utilizes graph convolution to learn latent information embedded in complex interactions among chemicals, genes, diseases and biological pathways. The learning process is featured by two consecutive steps. The embedded information learnt by the encoder is then employed to make multi-type interaction predictions between chemicals and genes with a tensor decomposition decoder based on the RESCAL algorithm. BioNet includes 79 325 entities as nodes, and 34 005 501 relations as edges. To train such a massive deep graph model, BioNet introduces a parallel training algorithm utilizing multiple graphics processing units (GPUs). The evaluation experiments indicated that BioNet exhibits outstanding prediction performance with a best area under the receiver operating characteristic (ROC) curve of 0.952, which significantly surpasses state-of-the-art methods. For further validation, top predicted CGIs of cancer and COVID-19 by BioNet were verified by external curated data and published literature.


Subjects
Computational Biology; Computer Simulation; Models, Biological; Neural Networks, Computer
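
The abstract above describes a tensor decomposition decoder based on the RESCAL algorithm. For orientation, the standard RESCAL bilinear scoring form is written out below. The symbols (chemical embedding e_c, gene embedding e_g, one relation matrix R_r per interaction type) follow the general RESCAL formulation; the sigmoid link that turns the score into a probability is an assumption for illustration and is not stated in the abstract.

```latex
% Standard RESCAL bilinear score for a (chemical c, relation r, gene g) triple.
% Assumption: the sigmoid link below is illustrative, not taken from the paper.
\[
  s(c, r, g) = \mathbf{e}_c^{\top} \mathbf{R}_r \, \mathbf{e}_g ,
  \qquad
  \hat{p}(c, r, g) = \sigma\bigl(s(c, r, g)\bigr)
                   = \frac{1}{1 + \exp\bigl(-s(c, r, g)\bigr)}
\]
```

Because each relation type has its own matrix while the entity embeddings are shared, one set of encoder outputs can score several interaction types between the same chemical and gene.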
20.
Acta Pharmacol Sin ; 43(6): 1605-1615, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34667293

ABSTRACT

Decaprenylphosphoryl-β-D-ribose oxidase (DprE1) plays important roles in the biosynthesis of the mycobacterial cell wall. DprE1 inhibitors have shown great potential in the development of new regimens for tuberculosis (TB) treatment. In this study, an integrated molecular modeling strategy, which combined computational bioactivity fingerprints and structure-based virtual screening, was employed to identify potential DprE1 inhibitors. Two lead compounds (B2 and H3) that could inhibit DprE1 and thus kill Mycobacterium smegmatis in vitro were identified. Moreover, compound H3 showed potent inhibitory activity against Mycobacterium tuberculosis in vitro (MIC_Mtb = 1.25 µM) and low cytotoxicity against mouse embryo fibroblast NIH-3T3 cells. Our research provided an effective strategy to discover novel anti-TB lead compounds.


Subjects
Antitubercular Agents; Mycobacterium tuberculosis; Animals; Antitubercular Agents/pharmacology; Antitubercular Agents/therapeutic use; Bacterial Proteins; Mice; Models, Molecular