Results 1 - 20 of 462
1.
Brief Bioinform; 25(6), 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39327890

ABSTRACT

Hitherto, virtual screening (VS) has typically been performed using a structure-based drug design paradigm. Such methods typically require molecular docking on high-resolution three-dimensional structures of a target protein, a computationally intensive and time-consuming exercise. This work demonstrates that by employing protein language models and molecular graphs as inputs to a novel graph-to-transformer cross-attention mechanism, a screening power comparable to state-of-the-art structure-based models can be achieved. The implications include highly expedited VS, owing to the greatly reduced compute required to run this model, and the ability to perform early stages of computer-aided drug design in the complete absence of 3D protein structures.
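
The cross-attention idea named in this abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes plain scaled dot-product attention with no learned projections, and the names `cross_attention`, `atom_nodes`, and `residues` are hypothetical. Ligand-graph node embeddings act as queries over protein language model residue embeddings.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each query attends over all key/value rows."""
    dim = len(keys[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value rows, one output row per query
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(len(values[0]))])
    return attended

# Hypothetical toy inputs (assumptions, not data from the paper):
# two ligand-atom node embeddings and three protein-residue embeddings.
atom_nodes = [[1.0, 0.0], [0.0, 1.0]]
residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = cross_attention(atom_nodes, residues, residues)
```

A real model would add learned query, key, and value projections and multiple heads; the sketch only shows how graph nodes can read from sequence embeddings without any 3D structure.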


Subjects
Proteins, Proteins/chemistry, Drug Design, Molecular Docking Simulation, Molecular Models, Protein Conformation
2.
RNA; 29(4): 473-488, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36693763

ABSTRACT

RNA structures regulate a wide range of processes in biology and disease, yet small molecule chemical probes or drugs that can modulate these functions are rare. Machine learning and other computational methods are well poised to fill gaps in knowledge and overcome the inherent challenges in RNA targeting, such as the dynamic nature of RNA and the difficulty of obtaining high-resolution RNA structures. Successful tools to date include principal component analysis, linear discriminant analysis, k-nearest neighbor, artificial neural networks, multiple linear regression, and many others. Employment of these tools has revealed critical factors for selective recognition in RNA:small molecule complexes, predictable differences in RNA- and protein-binding ligands, and quantitative structure-activity relationships that allow the rational design of small molecules for a given RNA target. Herein we present our perspective on the value of using machine learning and other computational methods to advance RNA:small molecule targeting, including select examples and their validation as well as necessary and promising future directions that will be key to accelerating discoveries in this important field.


Subjects
Machine Learning, RNA, RNA/genetics, RNA/chemistry, Computer Neural Networks
3.
Brief Bioinform; 23(1), 2022 Jan 17.
Article in English | MEDLINE | ID: mdl-34498670

ABSTRACT

With the consolidation of deep learning in drug discovery, several novel algorithms for learning molecular representations have been proposed. Despite the interest of the community in developing new methods for learning molecular embeddings and their theoretical benefits, comparing molecular embeddings with each other and with traditional representations is not straightforward, which in turn hinders the process of choosing a suitable representation for Quantitative Structure-Activity Relationship (QSAR) modeling. A reason behind this issue is the difficulty of conducting a fair and thorough comparison of the different existing embedding approaches, which requires numerous experiments on various datasets and training scenarios. To close this gap, we reviewed the literature on methods for molecular embeddings and reproduced three unsupervised and two supervised molecular embedding techniques recently proposed in the literature. We compared these five methods with respect to their performance in QSAR scenarios using different classification and regression datasets. We also compared these representations to traditional molecular representations, namely molecular descriptors and fingerprints. Contrary to the expected outcome, our experimental setup consisting of over 25,000 trained models and statistical tests revealed that the predictive performance using molecular embeddings did not significantly surpass that of traditional representations. Although supervised embeddings yielded competitive results compared with those using traditional molecular representations, unsupervised embeddings tended to perform worse than traditional representations. Our results highlight the need for conducting a careful comparison and analysis of the different embedding techniques prior to using them in drug design tasks and motivate a discussion about the potential of molecular embeddings in computer-aided drug design.


Subjects
Algorithms, Quantitative Structure-Activity Relationship
4.
Bioorg Med Chem Lett; 103: 129690, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38447786

ABSTRACT

Autotaxin (ATX) is a secreted lysophospholipase D and a member of the ectonucleotide pyrophosphatase/phosphodiesterase family that converts extracellular lysophosphatidylcholine and other non-choline lysophospholipids, such as lysophosphatidylethanolamine and lysophosphatidylserine, to the lipid mediator lysophosphatidic acid. Autotaxin is implicated in various fibroproliferative diseases including interstitial lung diseases, such as idiopathic pulmonary fibrosis and hepatic fibrosis, as well as in cancer. In this study, we present an effort to identify ATX inhibitors that bind to allosteric ATX binding sites using the Enalos Asclepios KNIME Node. All the available PDB crystal structures of ATX were collected, prepared, and aligned. Visual examination of these structures led to the identification of four crystal structures of human ATX co-crystallized with four known inhibitors. These inhibitors bind to five binding sites with five different binding modes. These five binding sites were thereafter used to virtually screen a compound library of 14,000 compounds to identify molecules that bind to allosteric sites. Based on the binding mode and interactions, the docking score, and the frequency with which a compound appears among the top-ranked compounds across the five binding sites, 24 compounds were selected for in vitro testing. Finally, two compounds emerged with inhibitory activity against ATX in the low micromolar range, and their mode of inhibition and binding pattern were also studied. The two derivatives identified herein can serve as "hits" towards developing novel classes of ATX allosteric inhibitors.
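
One of the selection criteria above, how often a compound ranks near the top across binding sites, can be sketched as follows. This is an illustration under stated assumptions, not the Enalos Asclepios KNIME workflow: the function name, the `top_n` cutoff, the compound names, and the scores are all hypothetical, and lower docking scores are assumed to be better.

```python
from collections import Counter

def top_ranked_frequency(scores_by_site, top_n):
    """Count, per compound, the number of sites where it ranks within the top_n scores.

    scores_by_site maps site name -> {compound: docking score}.
    Assumption: lower (more negative) scores indicate better predicted binding.
    """
    counts = Counter()
    for site_scores in scores_by_site.values():
        ranked = sorted(site_scores, key=site_scores.get)  # best score first
        counts.update(ranked[:top_n])
    return counts

# Hypothetical scores for three compounds at two sites (not data from the study).
scores = {
    "site_1": {"cpd_A": -9.1, "cpd_B": -7.4, "cpd_C": -8.2},
    "site_2": {"cpd_A": -8.8, "cpd_B": -9.0, "cpd_C": -6.5},
}
frequency = top_ranked_frequency(scores, top_n=2)
```

In practice this count would be weighed together with the binding mode and the docking score itself, as the abstract describes, rather than used alone.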


Subjects
Lysophospholipids, Neoplasms, Humans, Lysophospholipids/chemistry, Lysophospholipids/metabolism, Phosphoric Diester Hydrolases/metabolism, Neoplasms/metabolism, Binding Sites, Allosteric Site
5.
J Chem Inf Model; 64(11): 4392-4409, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38815246

ABSTRACT

By accelerating time-consuming processes with high efficiency, computing has become an essential part of many modern chemical pipelines. Machine learning is a class of computing methods that can discover patterns within chemical data and utilize this knowledge for a wide variety of downstream tasks, such as property prediction or substance generation. The complex and diverse chemical space requires complex machine learning architectures with great learning power. Recently, learning models based on transformer architectures have revolutionized multiple domains of machine learning, including natural language processing and computer vision. Naturally, there have been ongoing endeavors to adapt these techniques to the chemical domain, resulting in a surge of publications within a short period. The diversity of chemical structures, use cases, and learning models necessitates a comprehensive summary of existing works. In this paper, we review recent innovations in adapting transformers to solve learning problems in chemistry. Because chemical data is diverse and complex, we structure our discussion based on chemical representations. Specifically, we highlight the strengths and weaknesses of each representation, the current progress of adapting transformer architectures, and future directions.


Subjects
Cheminformatics, Machine Learning, Cheminformatics/methods
6.
Environ Sci Technol; 58(9): 4181-4192, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38373301

ABSTRACT

Alzheimer's disease (AD) is a complex and multifactorial neurodegenerative disease, which is currently diagnosed via clinical symptoms and nonspecific biomarkers (such as Aβ1-42, t-Tau, and p-Tau) measured in cerebrospinal fluid (CSF), which alone do not provide sufficient insights into disease progression. In this pilot study, these biomarkers were complemented with small-molecule analysis using non-target high-resolution mass spectrometry coupled with liquid chromatography (LC) on the CSF of three groups: AD, mild cognitive impairment (MCI) due to AD, and a non-demented (ND) control group. An open-source cheminformatics pipeline based on MS-DIAL and patRoon was enhanced using CSF- and AD-specific suspect lists to assist in data interpretation. Chemical Similarity Enrichment Analysis revealed a significant increase of hydroxybutyrates in AD, including 3-hydroxybutanoic acid, which was found at higher levels in AD compared to MCI and ND. Furthermore, a highly sensitive target LC-MS method was used to quantify 35 bile acids (BAs) in the CSF, revealing several statistically significant differences including higher dehydrolithocholic acid levels and decreased conjugated BA levels in AD. This work provides several promising small-molecule hypotheses that could be used to help track the progression of AD in CSF samples.


Subjects
Alzheimer Disease, Cognitive Dysfunction, Neurodegenerative Diseases, Humans, Alzheimer Disease/cerebrospinal fluid, Alzheimer Disease/diagnosis, Alzheimer Disease/psychology, tau Proteins/cerebrospinal fluid, Amyloid beta-Peptides/cerebrospinal fluid, Pilot Projects, Cognitive Dysfunction/cerebrospinal fluid, Cognitive Dysfunction/diagnosis, Cognitive Dysfunction/psychology, Biomarkers, Disease Progression
7.
Anal Bioanal Chem; 416(12): 2951-2968, 2024 May.
Article in English | MEDLINE | ID: mdl-38507043

ABSTRACT

Quantitative structure-retention relationship (QSRR) modeling has emerged as an efficient alternative to predict analyte retention times using molecular descriptors. However, most reported QSRR models are column-specific, requiring separate models for each high-performance liquid chromatography (HPLC) system. This study evaluates the potential of machine learning (ML) algorithms and quantum mechanical (QM) descriptors to develop QSRR models that can predict retention times across three different reversed-phase HPLC columns under varying conditions. Four machine learning methods, namely partial least squares (PLS) regression, ridge regression (RR), random forest (RF), and gradient boosting (GB), were compared on a dataset of 360 retention times for 15 aromatic analytes. Molecular descriptors were calculated using density functional theory (DFT). Column characteristics such as particle size and pore size, and experimental conditions such as temperature and gradient time, were additionally used as descriptors. Results showed that the GB-QSRR model demonstrated the best predictive performance, with a Q² of 0.989 and a root mean square error of prediction (RMSEP) of 0.749 min on the test set. Feature analysis revealed that solvation energy (SE), HOMO-LUMO energy gap (∆E HOMO-LUMO), total dipole moment (Mtot), and global hardness (η) are among the most influential predictors for retention time prediction, indicating the significance of electrostatic interactions and hydrophobicity. Our findings underscore the efficiency of ensemble methods, GB and RF models employing non-linear learners, in capturing local variations in retention times across diverse experimental setups. This study emphasizes the potential of cross-column QSRR modeling and highlights the utility of ML models in optimizing chromatographic analysis.
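
The two test-set metrics quoted above can be computed as in the sketch below. It assumes one common convention, in which Q² is one minus the ratio of the prediction error sum of squares to the total sum of squares about the mean of the observed test values; some authors use the training-set mean instead, and the abstract does not say which was used. The retention times are hypothetical.

```python
import math

def rmsep(observed, predicted):
    """Root mean square error of prediction, in the units of the observations."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def q2(observed, predicted):
    """Predictive Q2 = 1 - PRESS / TSS, with TSS taken about the observed mean (assumption)."""
    mean_observed = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - press / tss

# Hypothetical retention times in minutes (not data from the study).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
error_min = rmsep(observed, predicted)
q2_value = q2(observed, predicted)
```

Reporting both values is useful because Q² is dimensionless while RMSEP expresses the typical error directly in minutes.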

8.
Antonie Van Leeuwenhoek; 117(1): 55, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38488950

ABSTRACT

Antimicrobial peptides (AMPs) are promising cationic and amphipathic molecules to fight antibiotic resistance. To search for novel AMPs, we applied a computational strategy to identify peptide sequences within the organisms' proteomes, including in-house developed software and artificial intelligence tools. After analyzing 150,450 proteins from eight proteomes of bacteria, plants, a protist, and a nematode, nine peptides were selected and modified to increase their antimicrobial potential. The 18 resulting peptides were validated by bioassays with four pathogenic bacterial species, one yeast species, and two cancer cell lines. Fourteen of the 18 tested peptides were antimicrobial, with minimum inhibitory concentration (MIC) values under 10 µM against at least three bacterial species; seven were active against Candida albicans with MIC values under 10 µM; six had a therapeutic index above 20; two peptides were active against A549 cells, and eight were active against MCF-7 cells under 30 µM. This study's most active antimicrobial peptides damage the bacterial cell membrane, causing grooves, dents, membrane wrinkling, cell destruction, and leakage of cytoplasmic material. The results confirm that the proposed approach, which uses bioinformatic tools and rational modifications, is highly efficient and allows the discovery, with high accuracy, of potent AMPs encrypted in proteins.
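
The abstract does not describe how the in-house software finds candidate peptides inside proteins. As a purely illustrative sketch of the general idea, the code below slides a window along a protein sequence and keeps fragments with a high crude net charge, reflecting the cationic character of AMPs. The charge rule, the window length, the threshold, and the sequence are all assumptions.

```python
def net_charge(peptide):
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E (histidine and termini ignored)."""
    positive = sum(peptide.count(residue) for residue in "KR")
    negative = sum(peptide.count(residue) for residue in "DE")
    return positive - negative

def cationic_windows(protein, length, min_charge):
    """Yield (start, fragment, charge) for each window whose crude charge is at least min_charge."""
    for start in range(len(protein) - length + 1):
        fragment = protein[start:start + length]
        charge = net_charge(fragment)
        if charge >= min_charge:
            yield start, fragment, charge

# Hypothetical sequence, window length, and threshold (not from the study).
protein = "MDEAKRKLLKRWGAD"
hits = list(cationic_windows(protein, length=8, min_charge=4))
```

A realistic screen would also score amphipathicity and feed candidates to trained classifiers, which this sketch does not attempt.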


Subjects
Anti-Infective Agents, Proteome, Antimicrobial Cationic Peptides/pharmacology, Antimicrobial Cationic Peptides/chemistry, Antimicrobial Peptides, Artificial Intelligence, Anti-Infective Agents/pharmacology, Anti-Infective Agents/chemistry, Bacteria, Microbial Sensitivity Tests, Anti-Bacterial Agents/pharmacology
9.
Chem Pharm Bull (Tokyo); 72(9): 794-799, 2024.
Article in English | MEDLINE | ID: mdl-39218704

ABSTRACT

Recently, remarkable progress has been achieved in artificial intelligence (AI), including machine learning. Various AI models have been proposed for drug discovery, including the design of small molecules, activity prediction, and three-dimensional (3D) structure prediction of proteins. AI consists of diverse elements, including information retrieval and machine learning, and can be used in a wide range of drug discovery scenarios. In this review, we focused on AI for small-molecule drug discovery with respect to molecular design, activity prediction, and prediction of the binding poses of compounds to target molecules. We also discussed the applications of AI in academic drug discovery.


Subjects
Artificial Intelligence, Cheminformatics, Drug Discovery, Humans, Machine Learning, Small Molecule Libraries/chemistry
10.
Int J Mol Sci; 25(8), 2024 Apr 13.
Article in English | MEDLINE | ID: mdl-38673888

ABSTRACT

Urease, a pivotal enzyme in nitrogen metabolism, plays a crucial role in various microorganisms, including the pathogenic Helicobacter pylori. Inhibiting urease activity offers a promising approach to combating infections and associated ailments, such as chronic kidney diseases and gastric cancer. However, identifying potent urease inhibitors remains challenging due to resistance issues that hinder traditional approaches. Recently, machine learning (ML)-based models have demonstrated the ability to predict the bioactivity of molecules rapidly and effectively. In this study, we present ML models designed to predict urease inhibitors by leveraging essential physicochemical properties. The methodological approach involved constructing a dataset of urease inhibitors through an extensive literature search. Subsequently, these inhibitors were characterized using calculated physicochemical properties. An exploratory data analysis was then conducted to identify and analyze critical features. Ultimately, 252 classification models were trained, utilizing a combination of seven ML algorithms, three attribute selection methods, and six different strategies for categorizing inhibitory activity. The investigation unveiled discernible trends distinguishing urease inhibitors from non-inhibitors. This differentiation enabled the identification of essential features that are crucial for precise classification. Through a comprehensive comparison of ML algorithms, tree-based methods like random forest, decision tree, and XGBoost exhibited superior performance. Additionally, incorporating the "chemical family type" attribute significantly enhanced model accuracy. Strategies involving a gray-zone categorization demonstrated marked improvements in predictive precision. This research underscores the transformative potential of ML in predicting urease inhibitors.
The meticulous methodology outlined herein offers actionable insights for developing robust predictive models within biochemical systems.
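
The gray-zone categorization mentioned in the abstract can be pictured with a small sketch. It assumes, hypothetically, that activity is expressed as an IC50 in µM and that compounds falling between two cutoffs are left out of training; the cutoff values, function names, and compounds below are illustrative and are not taken from the study.

```python
def categorize_with_gray_zone(ic50_um, active_below, inactive_above):
    """Return a class label, or None when the value falls in the ambiguous gray zone."""
    if ic50_um <= active_below:
        return "inhibitor"
    if ic50_um >= inactive_above:
        return "non-inhibitor"
    return None  # gray zone: too ambiguous to label

def build_labels(ic50_by_compound, active_below, inactive_above):
    """Label every compound outside the gray zone and drop the rest."""
    labels = {}
    for name, ic50_um in ic50_by_compound.items():
        label = categorize_with_gray_zone(ic50_um, active_below, inactive_above)
        if label is not None:
            labels[name] = label
    return labels

# Hypothetical IC50 values in µM and hypothetical cutoffs (assumptions).
ic50_values = {"cpd_1": 2.0, "cpd_2": 25.0, "cpd_3": 120.0}
labels = build_labels(ic50_values, active_below=10.0, inactive_above=50.0)
```

Dropping borderline compounds trades dataset size for cleaner class boundaries, which is one plausible reason such strategies can improve precision.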


Subjects
Enzyme Inhibitors, Machine Learning, Urease, Urease/antagonists & inhibitors, Urease/chemistry, Urease/metabolism, Enzyme Inhibitors/chemistry, Enzyme Inhibitors/pharmacology, Helicobacter pylori/enzymology, Helicobacter pylori/drug effects, Algorithms, Humans
11.
Int J Mol Sci; 25(8), 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38673742

ABSTRACT

Artificial neural networks (ANNs) are nowadays applied as the most efficient methods in the majority of machine learning approaches, including data-driven modeling for assessment of the toxicity of chemicals. We developed a combined neural network methodology that can be used in the scope of new approach methodologies (NAMs) assessing chemical or drug toxicity. Here, we present QSAR models for predicting the physical and biochemical properties of molecules of three different datasets: aqueous solubility, acute fish toxicity toward fathead minnow, and bio-concentration factors. A novel neural network modeling method is developed by combining two neural network algorithms, namely, the counter-propagation modeling strategy (CP-ANN) with the back-propagation-of-errors algorithm (BPE-ANN). The advantage is a short training time, robustness, and good interpretability through the initial CP-ANN part, while the extension with BPE-ANN improves the precision of predictions in the range between minimal and maximal property values of the training data, regardless of the number of neurons in both neural networks, either CP-ANN or BPE-ANN.


Subjects
Algorithms, Computer Neural Networks, Animals, Quantitative Structure-Activity Relationship, Machine Learning
12.
Molecules; 29(1), 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38202859

ABSTRACT

MolOptimizer is a user-friendly computational toolkit designed to streamline the hit-to-lead optimization process in drug discovery. MolOptimizer extracts features and trains machine learning models using a user-provided, labeled small-molecule dataset to accurately predict the binding values of new small molecules that share similar scaffolds with the target in focus. Hosted on an Azure web-based server, MolOptimizer emerges as a vital resource, accelerating the discovery and development of novel drug candidates with improved binding properties.


Subjects
Drug Design, Drug Discovery, Machine Learning
13.
Molecules; 29(12), 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38930871

ABSTRACT

Synthetic efforts toward complex natural product (NP) scaffolds are useful, particularly those aimed at expanding their bioactive chemical space. Here, we utilised an orthogonal cheminformatics-based approach to predict the potential biological activities for a series of synthetic bis-indole alkaloids inspired by elusive sponge-derived NPs, echinosulfone A (1) and echinosulfonic acids A-D (2-5). Our work includes the first synthesis of desulfato-echinosulfonic acid C, an α-hydroxy bis(3'-indolyl) alkaloid (17), and its full NMR characterisation. This synthesis provides corroborating evidence for the structure revision of echinosulfonic acids A-C. Additionally, we demonstrate a robust synthetic strategy toward a diverse range of α-methine bis(3'-indolyl) acids and acetates (11-16) without the need for silica-based purification in either one or two steps. By integrating our synthetic library of bis-indoles with bioactivity data for 2048 marine indole alkaloids (reported up to the end of 2021), we analyzed their overlap with marine natural product chemical diversity. Notably, the C-6 dibrominated α-hydroxy bis(3'-indolyl) and α-methine bis(3'-indolyl) analogues (11, 14, and 17) were found to contain significant overlap with antibacterial C-6 dibrominated marine bis-indoles, guiding our biological evaluation. Validating the results of our cheminformatics analyses, the dibrominated α-methine bis(3'-indolyl) alkaloids (11, 12, 14, and 15) were found to exhibit antibacterial activities against methicillin-sensitive and -resistant Staphylococcus aureus. Further, while investigating other synthetic approaches toward bis-indole alkaloids, 16 incorrectly assigned synthetic α-hydroxy bis(3'-indolyl) alkaloids were identified. After careful analysis of their reported NMR data, and comparison with those obtained for the synthetic bis-indoles reported herein, all of the structures have been revised to α-methine bis(3'-indolyl) alkaloids.


Subjects
Anti-Bacterial Agents, Cheminformatics, Indole Alkaloids, Anti-Bacterial Agents/pharmacology, Anti-Bacterial Agents/chemistry, Anti-Bacterial Agents/chemical synthesis, Indole Alkaloids/chemistry, Indole Alkaloids/pharmacology, Indole Alkaloids/chemical synthesis, Cheminformatics/methods, Microbial Sensitivity Tests, Molecular Structure, Structure-Activity Relationship, Biological Products/chemistry, Biological Products/pharmacology, Biological Products/chemical synthesis
14.
Molecules; 29(8), 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38675645

ABSTRACT

In the realm of predictive toxicology for small molecules, the applicability domain of QSAR models is often limited by the coverage of the chemical space in the training set. Consequently, classical models fail to provide reliable predictions for wide classes of molecules. However, the emergence of innovative data collection methods such as intensive hackathons promises to quickly expand the available chemical space for model construction. Combined with algorithmic refinement methods, these tools can address the challenges of toxicity prediction, enhancing both the robustness and applicability of the corresponding models. This study aimed to investigate the roles of gradient boosting and strategic data aggregation in enhancing the predictive ability of models for the toxicity of small organic molecules. We focused on evaluating the impact of incorporating fragment features and expanding the chemical space, facilitated by a comprehensive dataset procured in an open hackathon. We used gradient boosting techniques, accounting for critical features such as the structural fragments or functional groups often associated with manifestations of toxicity.


Subjects
Algorithms, Quantitative Structure-Activity Relationship, Toxicology/methods, Humans
15.
Molecules; 29(12), 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38930883

ABSTRACT

Intracellular tau fibrils are sources of neurotoxicity and oxidative stress in Alzheimer's. Current drug discovery efforts have focused on molecules with tau fibril disaggregation and antioxidation functions. However, recent studies suggest that membrane-bound tau-containing oligomers (mTCOs), smaller and less ordered than tau fibrils, are neurotoxic in the early stage of Alzheimer's. Whether tau fibril-targeting molecules are effective against mTCOs is unknown. The binding of epigallocatechin-3-gallate (EGCG), CNS-11, and BHT-CNS-11 to in silico mTCOs and experimental tau fibrils was investigated using machine learning-enhanced docking and molecular dynamics simulations. EGCG and CNS-11 have tau fibril disaggregation functions, while the proposed BHT-CNS-11 has potential tau fibril disaggregation and antioxidation functions like EGCG. Our results suggest that the three molecules studied may also bind to mTCOs. The predicted binding probability of EGCG to mTCOs increases with the protein aggregate size. In contrast, the predicted probability of CNS-11 and BHT-CNS-11 binding to the dimeric mTCOs is higher than binding to the tetrameric mTCOs for the homo tau but not for the hetero tau-amylin oligomers. Our results also support the idea that anionic lipids may promote the binding of molecules to mTCOs. We conclude that tau fibril-disaggregating and antioxidating molecules may bind to mTCOs, and that mTCOs may also be useful targets for Alzheimer's drug design.


Subjects
Antioxidants, Machine Learning, Molecular Docking Simulation, Molecular Dynamics Simulation, Protein Binding, tau Proteins, tau Proteins/metabolism, tau Proteins/chemistry, Humans, Antioxidants/chemistry, Antioxidants/pharmacology, Amyloid/chemistry, Amyloid/metabolism, Catechin/analogs & derivatives, Catechin/chemistry, Catechin/metabolism, Catechin/pharmacology, Protein Aggregates
16.
Molecules; 29(15), 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39125052

ABSTRACT

Marine natural products (MNPs) continue to be tested primarily in cellular toxicity assays, both mammalian and microbial, despite most being inactive at concentrations relevant to drug discovery. These MNPs become missed opportunities and represent a wasteful use of precious bioresources. The use of cheminformatics aligned with published bioactivity data can provide insights to direct the choice of bioassays for the evaluation of new MNPs. Cheminformatics analysis of MNPs found in MarinLit (n = 39,730) up to the end of 2023 highlighted indol-3-yl-glyoxylamides (IGAs, n = 24) as a group of MNPs with no reported bioactivities. However, a recent review of synthetic IGAs highlighted these scaffolds as privileged structures with several compounds under clinical evaluation. Herein, we report the synthesis of a library of 32 MNP-inspired brominated IGAs (25-56) using a simple one-pot, multistep method affording access to these diverse chemical scaffolds. Directed by a meta-analysis of the biological activities reported for marine indole alkaloids (MIAs) and synthetic IGAs, the brominated IGAs 25-56 were examined for their potential bioactivities against the Parkinson's Disease amyloid protein alpha-synuclein (α-syn), antiplasmodial activities against chloroquine-sensitive (3D7) and -resistant (Dd2) parasite strains of Plasmodium falciparum, and inhibition of mammalian (chymotrypsin and elastase) and viral (SARS-CoV-2 3CLpro) proteases. All of the synthetic IGAs tested exhibited binding affinity to the amyloid protein α-syn, while some showed inhibitory activities against P. falciparum and the proteases SARS-CoV-2 3CLpro and chymotrypsin. The cellular safety of the IGAs was examined against cancerous and non-cancerous human cell lines, with all of the compounds tested inactive, thereby validating the cheminformatics and meta-analysis results.
The findings presented herein expand our knowledge of marine IGA bioactive chemical space and advocate expanding the scope of biological assays routinely used to investigate NP bioactivities, specifically those more suitable for non-toxic compounds. By integrating cheminformatics tools and functional assays into NP biological testing workflows, we can aim to enhance the potential of NPs and their scaffolds for future drug discovery and development.


Subjects
Biological Products, Cheminformatics, Drug Discovery, Biological Products/chemistry, Biological Products/pharmacology, Humans, Cheminformatics/methods, SARS-CoV-2/drug effects, Aquatic Organisms/chemistry, Indoles/chemistry, Indoles/pharmacology, Plasmodium falciparum/drug effects, Indole Alkaloids/pharmacology, Indole Alkaloids/chemistry, Animals
17.
Mol Biol Evol; 39(4), 2022 Apr 10.
Article in English | MEDLINE | ID: mdl-35298643

ABSTRACT

Countless reports describe the isolation and structural characterization of natural products, yet this information remains disconnected and underutilized. Using a cheminformatics approach, we leverage the reported observations of iridoid glucosides with the known phylogeny of a large iridoid producing plant family (Lamiaceae) to generate a set of biosynthetic pathways that best explain the extant iridoid chemical diversity. We developed a pathway reconstruction algorithm that connects iridoid reports via reactions and prunes this solution space by considering phylogenetic relationships between genera. We formulate a model that emulates the evolution of iridoid glucosides to create a synthetic data set, used to select the parameters that would best reconstruct the pathways, and apply them to the iridoid data set to generate pathway hypotheses. These computationally generated pathways were then used as the basis by which to select and screen biosynthetic enzyme candidates. Our model was successfully applied to discover a cytochrome P450 enzyme from Callicarpa americana that catalyzes the oxidation of bartsioside to aucubin, predicted by our model despite neither molecule having been observed in the genus. We also demonstrate aucubin synthase activity in orthologues of Vitex agnus-castus, and the outgroup Paulownia tomentosa, further strengthening the hypothesis, enabled by our model, that the reaction was present in the ancestral biosynthetic pathway. This is the first systematic hypothesis on the epi-iridoid glucosides biosynthesis in 25 years and sets the stage for streamlined work on the iridoid pathway. This work highlights how curation and computational analysis of widely available structural data can facilitate hypothesis-based gene discovery.


Subjects
Iridoid Glucosides, Lamiaceae, Cheminformatics, Iridoid Glucosides/chemistry, Iridoid Glucosides/metabolism, Iridoids/metabolism, Lamiaceae/genetics, Lamiaceae/metabolism, Phylogeny
18.
J Comput Chem; 44(1): 27-42, 2023 Jan 05.
Article in English | MEDLINE | ID: mdl-36239971

ABSTRACT

Algorithms that automatically explore the chemical space have been limited to chemical systems with a low number of atoms due to the expensive quantum calculations involved and the large number of possible reaction pathways. The method described here presents a novel solution to the problem of chemical exploration by generating reaction networks with heuristics based on chemical theory. First, a second version of the reaction network is determined through molecular graph transformations acting upon functional groups of the reacting species. Only transformations that break two chemical bonds and form two new ones are considered, leading to a significant performance enhancement compared to the previously presented algorithm. Second, energy barriers for this reaction network are estimated through quantum chemical calculations by a growing string method, which can also identify non-octet species missed during the previous step and further define the reaction network. The proposed algorithm has been successfully applied to five different chemical reactions, in all cases identifying the most important reaction pathways.
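
The "break two bonds, form two bonds" rule can be sketched on a bare molecular graph. The code below is an illustration, not the published algorithm: it treats a system as a set of undirected bonds between labeled atoms and ignores bond orders, valence, and the functional-group restriction described above. All names and the H2 + Cl2 example are assumptions.

```python
from itertools import combinations

def bond(a, b):
    """Undirected bond between two atom labels."""
    return frozenset((a, b))

def two_break_two_form(bonds):
    """Enumerate bond sets reachable by breaking two bonds and forming two new ones."""
    products = set()
    for b1, b2 in combinations(sorted(bonds, key=sorted), 2):
        if b1 & b2:
            continue  # bonds that share an atom cannot exchange partners
        a, b = sorted(b1)
        c, d = sorted(b2)
        # The four freed atoms can re-pair in two ways
        for new1, new2 in ((bond(a, c), bond(b, d)), (bond(a, d), bond(b, c))):
            if new1 in bonds or new2 in bonds:
                continue  # skip pairings that would duplicate an existing bond
            products.add(frozenset((bonds - {b1, b2}) | {new1, new2}))
    return products

# Hypothetical example: H2 + Cl2 with explicitly labeled atoms.
reactants = frozenset({bond("H1", "H2"), bond("Cl1", "Cl2")})
products = two_break_two_form(reactants)
```

Applied repeatedly to each new product, a rule of this kind grows a reaction network whose edges could then be passed to a more expensive barrier calculation.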

19.
J Mol Recognit; 36(10): e3055, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37658788

ABSTRACT

COVID-19 was declared a global pandemic in 2020. Several treatment options failed to cure the disease. Plant-based medicines are therefore becoming a trend due to their fewer side effects. Bioactive chemicals from natural sources have been utilised for centuries as treatment options for a variety of ailments. To identify potent bioactive compounds to counteract COVID-19, we used systems pharmacology and cheminformatics, which use available data to predict possible outcomes. In this study, we collected a total of 72 phytocompounds from medicinally important plants such as Garcinia mangostana and Cinnamomum verum, of which 13 potential phytocompounds were identified as active against COVID-19 infection based on Swiss Target Prediction and compound-target network analysis. These phytocompounds were annotated to identify the specific human receptors that target COVID-19-specific genes such as MAPK8, MAPK14, ACE, CYP3A4, TLR4 and TYK2. Among these, compounds such as smeathxanthone A, demethylcalabaxanthone, mangostanol and trapezifolixanthone from Garcinia mangostana and camphene from C. verum putatively target various COVID-19-related genes. Molecular docking results showed that smeathxanthone A and demethylcalabaxanthone exhibit increased binding efficiency towards the COVID-19-related receptor proteins. These compounds also showed more favourable putative pharmacoactive properties than the commercial drugs ((R)-remdesivir, favipiravir and hydroxychloroquine) used to treat COVID-19. In conclusion, our study highlights the use of a cheminformatics approach to uncover potent and novel phytocompounds against COVID-19. These phytocompounds may be safer to use, more efficient and less harmful. This study highlights the value of natural products in the search for new drugs and identifies candidates with great promise.


Subjects
COVID-19, Cheminformatics, Humans, Network Pharmacology, Molecular Docking Simulation
20.
Brief Bioinform; 22(1): 474-484, 2021 Jan 18.
Article in English | MEDLINE | ID: mdl-31885044

ABSTRACT

BACKGROUND: With the increasing development of biotechnology and information technology, publicly available data in chemistry and biology are undergoing explosive growth. The wealth of information in these resources needs to be extracted and then transformed into useful knowledge by various data mining methods. However, a main computational challenge is how to effectively represent or encode the molecular objects under investigation, such as chemicals, proteins, DNAs and even complicated interactions, when data mining methods are employed. To further explore these complicated data, an integrated toolkit to represent different types of molecular objects and support various data mining algorithms is urgently needed. RESULTS: We developed a freely available R/CRAN package, called BioMedR, for molecular representations of chemicals, proteins, DNAs and pairwise samples of their interactions. The current version of BioMedR can calculate 293 molecular descriptors and 13 kinds of molecular fingerprints for small molecules, 9920 protein descriptors based on protein sequences and six types of generalized scale-based descriptors for proteochemometric modeling, more than 6000 DNA descriptors from nucleotide sequences and six types of interaction descriptors using three different combining strategies. Moreover, this package implements five similarity calculation methods and four powerful clustering algorithms as well as several useful auxiliary tools, with the aim of building an integrated analysis pipeline for data acquisition, data checking, descriptor calculation and data modeling. CONCLUSION: BioMedR provides a comprehensive and uniform R package to link up different representations of molecular objects with each other and will benefit cheminformatics/bioinformatics and other biomedical users. It is available at: https://CRAN.R-project.org/package=BioMedR and https://github.com/wind22zhu/BioMedR/.


Subjects
Computational Biology/methods, Database Management Systems, Data Management/methods, Chemical Databases, Genetic Databases, Humans