Results 1 - 20 of 353
1.
PLoS One ; 19(7): e0306202, 2024.
Article in English | MEDLINE | ID: mdl-38968199

ABSTRACT

Chemical information has become increasingly ubiquitous and has outstripped the pace of analysis and interpretation. We have developed an R package, uafR, that automates a grueling retrieval process for gas chromatography coupled with mass spectrometry (GC-MS) data and allows anyone interested in chemical comparisons to quickly perform advanced structural similarity matches. Our streamlined cheminformatics workflows allow anyone with basic experience in R to pull out component areas for tentative compound identifications using the best published understanding of molecules across samples (pubchem.gov). Interpretations can now be done at a fraction of the time, cost, and effort it would typically take using a standard chemical ecology data analysis pipeline. The package was tested in two experimental contexts: (1) a dataset of purified internal standards, which showed that our algorithms correctly identified the known compounds, with R² values ranging from 0.827 to 0.999 across concentrations ranging from 1 × 10⁻⁵ to 1 × 10³ ng/µl; and (2) a large, previously published dataset, where the number and types of compounds identified were comparable (or identical) to those identified with the traditional manual peak annotation process, and NMDS analysis of the compounds produced the same pattern of significance as in the original study. Both the speed and accuracy of GC-MS data processing are drastically improved with uafR because it allows users to fluidly interact with their experiment following tentative library identifications [i.e., after the m/z spectra have been matched against an installed chemical fragmentation database (e.g., NIST)]. Use of uafR will allow larger datasets to be collected and systematically interpreted quickly. Furthermore, the functions of uafR could allow backlogs of previously collected and annotated data to be processed by new personnel or students as they are being trained. This is critical as we enter the era of exposomics, metabolomics, volatilomes, and landscape-level, high-throughput chemotyping. This package was developed to advance collective understanding of chemical data and is applicable to any research that benefits from GC-MS analysis. It can be downloaded for free along with sample datasets from GitHub at github.com/castratton/uafR or installed directly from R or RStudio using the developer tools: 'devtools::install_github("castratton/uafR")'.
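The R² values quoted in this abstract describe how closely recovered peak areas follow the known standard concentrations. The following is a minimal Python sketch of that kind of check, assuming a straight-line fit on log-transformed values; the dilution series, the peak areas, and the function name `r_squared` are illustrative assumptions and are not taken from uafR or its test data.

```python
import math

def r_squared(x, y):
    """Coefficient of determination for a least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical dilution series (ng/µl) and integrated peak areas; made-up numbers.
concentrations = [1e-5, 1e-3, 1e-1, 1e1, 1e3]
peak_areas = [2.1e2, 1.8e4, 2.3e6, 1.9e8, 2.2e10]

# Fit on a log-log scale because the series spans eight orders of magnitude.
log_conc = [math.log10(c) for c in concentrations]
log_area = [math.log10(a) for a in peak_areas]
print(round(r_squared(log_conc, log_area), 4))
```

Whether the original study fitted raw or log-transformed values is not stated in the abstract, so the log-log choice here is only one plausible reading.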


Subjects
Algorithms, Gas Chromatography-Mass Spectrometry, Software, Gas Chromatography-Mass Spectrometry/methods, Cheminformatics/methods
2.
J Chem Inf Model ; 64(14): 5521-5534, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38950894

ABSTRACT

Information extraction from chemistry literature is vital for constructing up-to-date reaction databases for data-driven chemistry. Complete extraction requires combining information across text, tables, and figures, whereas prior work has mainly investigated extracting reactions from single modalities. In this paper, we present OpenChemIE to address this complex challenge and enable the extraction of reaction data at the document level. OpenChemIE approaches the problem in two steps: extracting relevant information from individual modalities and then integrating the results to obtain a final list of reactions. For the first step, we employ specialized neural models that each address a specific task for chemistry information extraction, such as parsing molecules or reactions from text or figures. We then integrate the information from these modules using chemistry-informed algorithms, allowing for the extraction of fine-grained reaction data from reaction condition and substrate scope investigations. Our machine learning models attain state-of-the-art performance when evaluated individually, and we meticulously annotate a challenging dataset of reaction schemes with R-groups to evaluate our pipeline as a whole, achieving an F1 score of 69.5%. Additionally, the reaction extraction results of OpenChemIE attain an accuracy score of 64.3% when directly compared against the Reaxys chemical database. OpenChemIE is most suited for information extraction on organic chemistry literature, where molecules are generally depicted as planar graphs or written in text and can be consolidated into a SMILES format. We provide OpenChemIE freely to the public as an open-source package, as well as through a web interface.
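The F1 score reported in this abstract is the harmonic mean of precision and recall. A minimal Python sketch of that calculation follows, assuming each reaction is reduced to a (reactant SMILES, product SMILES) pair and that a prediction counts only on an exact match; the matching rule, the example reactions, and the function name are assumptions for illustration, and OpenChemIE's actual evaluation protocol may differ.

```python
def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over two collections of items."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)

# Hypothetical gold-standard and extracted reactions as (reactant, product) pairs.
gold = {("CCO", "CC=O"), ("CCBr", "CCO"), ("c1ccccc1", "Brc1ccccc1")}
predicted = {("CCO", "CC=O"), ("CCBr", "CCO"), ("CCO", "CCOCC")}

print(f1_score(predicted, gold))
```

Here two of three predictions are correct and two of three gold reactions are recovered, so precision and recall are equal and the F1 score equals both.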


Subjects
Machine Learning, Data Mining/methods, Chemical Databases, Algorithms, Cheminformatics/methods
3.
J Chem Inf Model ; 64(14): 5570-5579, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38958581

ABSTRACT

One of the most challenging tasks in modern medicine is to find novel efficient cancer therapeutic methods with minimal side effects. The recent discovery of several classes of organic molecules known as "molecular jackhammers" is a promising development in this direction. It is known that these molecules can directly target and eliminate cancer cells with no impact on healthy tissues. However, the underlying microscopic picture remains poorly understood. We present a study that utilizes theoretical analysis together with experimental measurements to clarify the microscopic aspects of jackhammers' anticancer activities. Our physical-chemical approach combines statistical analysis with chemoinformatics methods to design and optimize molecular jackhammers. By correlating specific physical-chemical properties of these molecules with their abilities to kill cancer cells, several important structural features are identified and discussed. Although our theoretical analysis enhances understanding of the molecular interactions of jackhammers, it also highlights the need for further research to comprehensively elucidate their mechanisms and to develop a robust physical-chemical framework for the rational design of targeted anticancer drugs.


Subjects
Antineoplastic Agents, Cheminformatics, Humans, Antineoplastic Agents/pharmacology, Antineoplastic Agents/chemistry, Cheminformatics/methods, Neoplasms/drug therapy, Neoplasms/pathology, Tumor Cell Line, Molecular Models
4.
Mol Inform ; 43(7): e202400052, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38994633

ABSTRACT

Compound databases of natural products play a crucial role in drug discovery and development projects and have implications in other areas, such as food chemical research, ecology and metabolomics. Recently, we put together the first version of the Latin American Natural Product database (LANaPDB) as a collective effort of researchers from six countries to assemble a public and representative library of natural products in a geographical region with high biodiversity. The present work aims to conduct a comparative and extensive profiling of the natural product-likeness of an updated version of LANaPDB and the ten individual compound databases that form part of LANaPDB. The natural product-likeness profile of the Latin American compound databases is contrasted with the profile of other major natural product databases in the public domain and a set of small-molecule drugs approved for clinical use. As part of the extensive characterization, we employed several chemoinformatics metrics of natural product-likeness. The results of this study should be of interest to the global community engaged in natural product databases, not only in Latin America but across the world.


Subjects
Biological Products, Biological Products/chemistry, Biological Products/pharmacology, Latin America, Small Molecule Libraries/pharmacology, Small Molecule Libraries/chemistry, Drug Discovery, Cheminformatics, Chemical Databases
5.
J Chem Inf Model ; 64(14): 5451-5469, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38949069

ABSTRACT

This study addresses the challenge of accurately identifying stereoisomers in cheminformatics, which originates from our objective to apply machine learning to predict the association constant between cyclodextrin and a guest. Identifying stereoisomers is indeed crucial for machine learning applications. Current tools offer various molecular descriptors, including their textual representation as Isomeric SMILES, which can distinguish stereoisomers. However, such a representation is text-based and does not have a fixed size, so a conversion is needed to make it usable by machine learning approaches. Word embedding techniques can be used to solve this problem. Mol2vec, a word embedding approach for molecules, offers such a conversion. Unfortunately, it cannot distinguish between stereoisomers due to its inability to capture the spatial configuration of molecular structures. This study proposes several approaches that use word embedding techniques to handle molecular discrimination, either using stereochemical information on molecules or treating Isomeric SMILES notation as text in Natural Language Processing. Our aim is to generate a distinct vector for each unique molecule, correctly identifying stereoisomer information in cheminformatics. The proposed approaches are then compared on our original machine learning task: predicting the association constant between cyclodextrin and a guest molecule.
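To make the stereoisomer problem concrete, the Python sketch below tokenizes two Isomeric SMILES strings for the enantiomers of alanine and shows that the token sequences differ only in their chirality markers, so any representation that discards those markers maps both molecules to the same input. This is an illustration of the issue only: it is not Mol2vec and not the method proposed by the authors, and the tokenizer pattern and helper names are assumptions chosen for the example.

```python
import re

# Regular expression that splits a SMILES string into atom, bond, and ring tokens.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)

def tokenize(smiles):
    """Return the list of SMILES tokens, keeping bracket atoms whole."""
    return SMILES_TOKEN.findall(smiles)

def strip_stereo(tokens):
    """Drop chirality markers (@, @@) and bond-direction tokens (slash, backslash)."""
    stripped = []
    for tok in tokens:
        if tok in ("/", "\\"):
            continue
        stripped.append(tok.replace("@", ""))
    return stripped

# The two enantiomers of alanine written as Isomeric SMILES.
enantiomer_1 = "C[C@H](N)C(=O)O"
enantiomer_2 = "C[C@@H](N)C(=O)O"

tokens_1 = tokenize(enantiomer_1)
tokens_2 = tokenize(enantiomer_2)

print(tokens_1 != tokens_2)                              # stereo-aware tokens differ
print(strip_stereo(tokens_1) == strip_stereo(tokens_2))  # stereo-blind tokens coincide
```

An embedding trained on the stereo-aware token sequences at least has the information needed to separate the two molecules, whereas one built on the stripped sequences cannot.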


Subjects
Machine Learning, Stereoisomerism, Cheminformatics/methods, Cyclodextrins/chemistry, Natural Language Processing
6.
PLoS One ; 19(6): e0302105, 2024.
Article in English | MEDLINE | ID: mdl-38889115

ABSTRACT

The present study explored efficient inhibitors of the closed state (form) of the type III effector Xanthomonas outer protein Q (XopQ) (PDB: 4P5F) among the 44 phytochemicals of Picrasma quassioides using computational analysis. Among them, Kumudine B showed the most favorable binding energy (-11.0 kcal/mol) with the targeted closed state of the XopQ protein, followed by Picrasamide A, Quassidine I and Quassidine J, compared to the reference standard drug (Streptomycin). Molecular dynamics (MD) simulations run for 300 ns validated the stability of the XopQ protein complexes bound to the top lead ligands (Kumudine B, Picrasamide A, and Quassidine I), with slightly lower fluctuation than Streptomycin. The MM-PBSA calculation confirmed the strong interactions of the top lead ligands (Kumudine B and Quassidine I) with the XopQ protein, as they offered the lowest binding energy. The results of absorption, distribution, metabolism, excretion, and toxicity (ADMET) analysis indicated that Quassidine I, Kumudine B and Picrasamide A satisfy most of the drug-likeness rules, with excellent bioavailability scores compared to Streptomycin. Results of the computational studies suggested that Kumudine B, Picrasamide A, and Quassidine I could be considered potential compounds to design novel antibacterial drugs against X. oryzae infection. Further in vitro and in vivo studies of the antibacterial activities of Kumudine B, Picrasamide A, and Quassidine I are required to confirm their therapeutic potential in controlling X. oryzae infection.


Subjects
Anti-Bacterial Agents, Molecular Dynamics Simulation, Xanthomonas, Anti-Bacterial Agents/pharmacology, Anti-Bacterial Agents/chemistry, Xanthomonas/drug effects, Cheminformatics/methods, Molecular Docking Simulation, Bacterial Proteins/antagonists & inhibitors, Bacterial Proteins/metabolism, Bacterial Proteins/chemistry
7.
Molecules ; 29(12), 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38930871

ABSTRACT

Synthetic efforts toward complex natural product (NP) scaffolds are useful, particularly those aimed at expanding their bioactive chemical space. Here, we utilised an orthogonal cheminformatics-based approach to predict the potential biological activities for a series of synthetic bis-indole alkaloids inspired by elusive sponge-derived NPs, echinosulfone A (1) and echinosulfonic acids A-D (2-5). Our work includes the first synthesis of desulfato-echinosulfonic acid C, an α-hydroxy bis(3'-indolyl) alkaloid (17), and its full NMR characterisation. This synthesis provides corroborating evidence for the structure revision of echinosulfonic acids A-C. Additionally, we demonstrate a robust synthetic strategy toward a diverse range of α-methine bis(3'-indolyl) acids and acetates (11-16) without the need for silica-based purification in either one or two steps. By integrating our synthetic library of bis-indoles with bioactivity data for 2048 marine indole alkaloids (reported up to the end of 2021), we analysed their overlap with marine natural product chemical diversity. Notably, the C-6 dibrominated α-hydroxy bis(3'-indolyl) and α-methine bis(3'-indolyl) analogues (11, 14, and 17) were found to overlap significantly with antibacterial C-6 dibrominated marine bis-indoles, guiding our biological evaluation. Validating the results of our cheminformatics analyses, the dibrominated α-methine bis(3'-indolyl) alkaloids (11, 12, 14, and 15) were found to exhibit antibacterial activities against methicillin-sensitive and -resistant Staphylococcus aureus. Further, while investigating other synthetic approaches toward bis-indole alkaloids, 16 incorrectly assigned synthetic α-hydroxy bis(3'-indolyl) alkaloids were identified. After careful analysis of their reported NMR data, and comparison with those obtained for the synthetic bis-indoles reported herein, all of the structures have been revised to α-methine bis(3'-indolyl) alkaloids.


Subjects
Anti-Bacterial Agents, Cheminformatics, Indole Alkaloids, Anti-Bacterial Agents/pharmacology, Anti-Bacterial Agents/chemistry, Anti-Bacterial Agents/chemical synthesis, Indole Alkaloids/chemistry, Indole Alkaloids/pharmacology, Indole Alkaloids/chemical synthesis, Cheminformatics/methods, Microbial Sensitivity Tests, Molecular Structure, Structure-Activity Relationship, Biological Products/chemistry, Biological Products/pharmacology, Biological Products/chemical synthesis
8.
BMC Bioinformatics ; 25(1): 225, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38926641

ABSTRACT

PURPOSE: Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT) from OpenAI and LLaMA (Large Language Model Meta AI) from Meta AI are increasingly recognized for their potential in the field of cheminformatics, particularly in understanding Simplified Molecular Input Line Entry System (SMILES), a standard method for representing chemical structures. These LLMs also have the ability to encode SMILES strings into vector representations. METHOD: We investigate the performance of GPT and LLaMA compared to pre-trained models on SMILES in embedding SMILES strings on downstream tasks, focusing on two key applications: molecular property prediction and drug-drug interaction (DDI) prediction. RESULTS: We find that SMILES embeddings generated using LLaMA outperform those from GPT in both molecular property and DDI prediction tasks. Notably, LLaMA-based SMILES embeddings show results comparable to pre-trained models on SMILES in molecular prediction tasks and outperform the pre-trained models for the DDI prediction tasks. CONCLUSION: The performance of LLMs in generating SMILES embeddings shows great potential for further investigation of these models for molecular embedding. We hope our study bridges the gap between LLMs and molecular embedding, motivating additional research into the potential of LLMs in the molecular representation field. GitHub: https://github.com/sshaghayeghs/LLaMA-VS-GPT.


Subjects
Cheminformatics, Cheminformatics/methods, Drug Interactions, Molecular Structure
9.
J Chem Inf Model ; 64(11): 4392-4409, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38815246

ABSTRACT

By accelerating time-consuming processes with high efficiency, computing has become an essential part of many modern chemical pipelines. Machine learning is a class of computing methods that can discover patterns within chemical data and utilize this knowledge for a wide variety of downstream tasks, such as property prediction or substance generation. The complex and diverse chemical space requires complex machine learning architectures with great learning power. Recently, learning models based on transformer architectures have revolutionized multiple domains of machine learning, including natural language processing and computer vision. Naturally, there have been ongoing endeavors in adapting these techniques to the chemical domain, resulting in a surge of publications within a short period. The diversity of chemical structures, use cases, and learning models necessitates a comprehensive summarization of existing works. In this paper, we review recent innovations in adapting transformers to solve learning problems in chemistry. Because chemical data is diverse and complex, we structure our discussion based on chemical representations. Specifically, we highlight the strengths and weaknesses of each representation, the current progress of adapting transformer architectures, and future directions.


Subjects
Cheminformatics, Machine Learning, Cheminformatics/methods
10.
Food Chem ; 454: 139794, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-38797094

ABSTRACT

Sweet potatoes are rich in cardioprotective phytochemicals with potential anti-platelet aggregation (anti-PA) activity, although this benefit may vary among cultivars/genotypes. The phenolic profile [HPLC-ESI(-)-qTOF-MS²], cheminformatics (ADMET properties, affinity toward platelet proteins) and anti-PA activity of phenolic-rich hydroalcoholic extracts obtained from orange (OSP) and purple (PSP) sweet potato storage roots were evaluated. The phenolic richness [hydroxycinnamic acids > flavonoids > benzoic acids] was PSP > OSP. Their main chlorogenic acids could interact with platelet proteins (integrins/adhesins, kinases/metalloenzymes) but their bioavailability could be poor. Only OSP exhibited a dose-dependent anti-platelet aggregation activity [inducer (IC₅₀, mg·ml⁻¹): thrombin receptor activator peptide-6 (0.55) > adenosine-5'-diphosphate (1.02) > collagen (1.56)] and reduced P-selectin expression (0.75-1.0 mg·ml⁻¹) but not glycoprotein IIb/IIIa secretion. The explored anti-PA activity of OSP/PSP seems to be inversely related to their phenolic richness. The poor first-pass bioavailability of their chlorogenic acids (documented in silico) may represent a further obstacle for their anti-PA activity in vivo.


Subjects
Ipomoea batatas, Phenols, Plant Extracts, Plant Roots, Platelet Aggregation Inhibitors, Platelet Aggregation, Ipomoea batatas/chemistry, Phenols/chemistry, Platelet Aggregation/drug effects, Plant Extracts/chemistry, Plant Extracts/pharmacology, Platelet Aggregation Inhibitors/chemistry, Platelet Aggregation Inhibitors/pharmacology, Plant Roots/chemistry, Humans, Cheminformatics, Animals, Blood Platelets/metabolism, Blood Platelets/drug effects
11.
Mol Inform ; 43(7): e202400018, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38803302

ABSTRACT

The growing interest in chemoinformatic model uncertainty calls for a summary of the most widely used regression techniques and how to estimate their reliability. Regression models learn a mapping from the space of explanatory variables to the space of continuous output values. Among other limitations, the predictive performance of the model is restricted by the training data used for model fitting. Identification of unusual objects by outlier detection methods can improve model performance. Additionally, proper model evaluation necessitates defining the limitations of the model, often called the applicability domain. Comparable to certain classifiers, some regression techniques come with built-in methods or augmentations to quantify their (un)certainty, while others rely on generic procedures. The theoretical background of their working principles and how to deduce specific and general definitions for their domain of applicability shall be explained.
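One generic way to define an applicability domain, of the kind this abstract alludes to, is to flag a query as out of domain when it lies farther from the training data than the training points lie from each other. The Python sketch below implements that idea with a mean k-nearest-neighbour distance and a leave-one-out threshold; it is one common heuristic chosen for illustration, not a procedure taken from the review, and the descriptor values, `k`, and function names are all assumptions.

```python
import math

def knn_mean_distance(point, others, k):
    """Mean Euclidean distance from point to its k nearest neighbours in others."""
    dists = sorted(math.dist(point, o) for o in others)
    return sum(dists[:k]) / k

def fit_threshold(train, k=2):
    """Largest leave-one-out k-NN mean distance observed within the training set."""
    return max(
        knn_mean_distance(p, train[:i] + train[i + 1:], k)
        for i, p in enumerate(train)
    )

def in_domain(query, train, threshold, k=2):
    """True when the query is no farther from the data than any training point."""
    return knn_mean_distance(query, train, k) <= threshold

# Hypothetical two-descriptor training set (corners of a unit square).
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
threshold = fit_threshold(train)

print(in_domain((0.5, 0.5), train, threshold))  # query close to the training data
print(in_domain((5.0, 5.0), train, threshold))  # query far from the training data
```

Using the maximum leave-one-out distance as the cutoff is deliberately permissive; a percentile-based cutoff would be a stricter alternative under the same scheme.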


Subjects
Cheminformatics, Cheminformatics/methods, Regression Analysis
12.
ACS Chem Biol ; 19(4): 938-952, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38565185

ABSTRACT

Phenotypic assays have become an established approach to drug discovery. Greater disease relevance is often achieved through cellular models with increased complexity and more detailed readouts, such as gene expression or advanced imaging. However, the intricate nature and cost of these assays impose limitations on their screening capacity, often restricting screens to well-characterized small compound sets such as chemogenomics libraries. Here, we outline a cheminformatics approach to identify a small set of compounds with likely novel mechanisms of action (MoAs), expanding the MoA search space for throughput-limited phenotypic assays. Our approach is based on mining existing large-scale, phenotypic high-throughput screening (HTS) data. It enables the identification of chemotypes that exhibit selectivity across multiple cell-based assays, which are characterized by persistent and broad structure-activity relationships (SAR). We validate the effectiveness of our approach in broad cellular profiling assays (Cell Painting, DRUG-seq, and Promotor Signature Profiling) and chemical proteomics experiments. These experiments revealed that the compounds behave similarly to known chemogenetic libraries, but with a notable bias toward novel protein targets. To foster collaboration and advance research in this area, we have curated a public set of such compounds based on the PubChem BioAssay dataset and made it available for use by the scientific community.


Subjects
Drug Discovery, High-Throughput Screening Assays, Small Molecule Libraries, Drug Discovery/methods, High-Throughput Screening Assays/methods, Cheminformatics/methods, Small Molecule Libraries/chemistry, Structure-Activity Relationship
13.
Sci Rep ; 14(1): 9801, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38684706

ABSTRACT

The COVID-19 pandemic has accelerated efforts to discover a therapeutic strategy that targets severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to control viral infection. Various viral proteins have been identified as potential drug targets; however, to date, no specific therapeutic cure is available against SARS-CoV-2. To address this issue, the present work reports a systematic cheminformatics approach to identify potent andrographolide derivatives that can target the methyltransferases of SARS-CoV-2, i.e. nsp14 and nsp16, which are crucial for viral replication and host immune evasion. A consensus of cheminformatics methodologies including virtual screening, molecular docking, ADMET profiling, molecular dynamics (MD) simulations, free-energy landscape analysis, molecular mechanics generalized Born surface area (MM-GBSA), and density functional theory (DFT) was utilized. Our study reveals two new andrographolide derivatives (PubChem CID: 2734589 and 138968421) as natural bioactive molecules that can form stable complexes with both proteins via hydrophobic interactions, hydrogen bonds and electrostatic interactions. The toxicity analysis predicts class four toxicity for both compounds, with LD50 values in the range of 500-700 mg/kg. MD simulation reveals the stable formation of the complex for both compounds, and their average trajectory values were found to be lower than those of the control inhibitor and the protein alone. MM-GBSA analysis corroborates the MD simulation result and showed the lowest energy for compounds 2734589 and 138968421. The DFT and MEP analyses also predict better reactivity and stability for both hit compounds. Overall, both andrographolide derivatives exhibit good potential as potent inhibitors of both nsp14 and nsp16 proteins; however, in vitro and in vivo assessment would be required to prove their efficacy and safety in clinical settings. Moreover, the drug discovery strategy aiming at the dual target approach might serve as a useful model for inventing novel drug molecules for various other diseases.


Subjects
Antiviral Agents, Diterpenes, Methyltransferases, Molecular Docking Simulation, Molecular Dynamics Simulation, SARS-CoV-2, Viral Nonstructural Proteins, Diterpenes/pharmacology, Diterpenes/chemistry, SARS-CoV-2/drug effects, SARS-CoV-2/enzymology, Methyltransferases/antagonists & inhibitors, Methyltransferases/chemistry, Methyltransferases/metabolism, Antiviral Agents/pharmacology, Antiviral Agents/chemistry, Humans, Viral Nonstructural Proteins/antagonists & inhibitors, Viral Nonstructural Proteins/chemistry, Viral Nonstructural Proteins/metabolism, Cheminformatics/methods, COVID-19/virology, Enzyme Inhibitors/chemistry, Enzyme Inhibitors/pharmacology, COVID-19 Drug Treatment
15.
J Am Chem Soc ; 146(12): 8016-8030, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38470819

ABSTRACT

There have been significant advances in the flexibility and power of in vitro cell-free translation systems. The increasing ability to incorporate noncanonical amino acids and complement translation with recombinant enzymes has enabled cell-free production of peptide-based natural products (NPs) and NP-like molecules. We anticipate that many more such compounds and analogs might be accessed in this way. To assess the peptide NP space that is directly accessible to current cell-free technologies, we developed a peptide parsing algorithm that breaks down peptide NPs into building blocks based on ribosomal translation logic. Using the resultant data set, we broadly analyze the biophysical properties of these privileged compounds and perform a retrobiosynthetic analysis to predict which peptide NPs could be directly synthesized in augmented cell-free translation reactions. We then tested these predictions by preparing a library of highly modified peptide NPs. Two macrocyclases, PatG and PCY1, were used to effect the head-to-tail macrocyclization of candidate NPs. This retrobiosynthetic analysis identified a collection of high-priority building blocks that are enriched throughout peptide NPs, yet they had not previously been tested in cell-free translation. To expand the cell-free toolbox into this space, we established, optimized, and characterized the flexizyme-enabled ribosomal incorporation of piperazic acids. Overall, these results demonstrate the feasibility of cell-free translation for peptide NP total synthesis while expanding the limits of the technology. This work provides a novel computational tool for exploration of peptide NP chemical space that could be expanded in the future to allow design of ribosomal biosynthetic pathways for NPs and NP-like molecules.


Subjects
Biological Products, Biological Products/chemistry, Cheminformatics, Peptides/chemistry, Peptide Biosynthesis, Amino Acids
16.
J Chem Inf Model ; 64(6): 1966-1974, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38437714

ABSTRACT

Chemical diversity is challenging to describe objectively. Despite this, various notions of chemical diversity are used throughout the medicinal chemistry optimization process in drug discovery. In this work, we show the usefulness of considering exploited vectors during different phases of the drug design process to provide a quantitative and objective description of chemical diversity. We have developed a concise and fast approach to enumerate and analyze the exploited vector patterns (EVPs) of molecular compound series, which can then be used in archetypal compound selection tasks, from hit matter identification to hit expansion and lead optimization. We first show that EVPs can be used to assess the progressibility of compounds in a fragment library design exercise. By considering EVPs, we then show how a set of compounds can be prioritized for hit expansion using EVP-based, customizable diversity sampling approaches, reducing the time taken and mitigating human biases. We also show that EVPs are a useful tool to analyze SAR data, offering the chance to uncover correlations between different vectors without predetermining the molecular scaffold structures. The codes used to perform these tasks are presented as easy-to-use Jupyter notebooks, which can be readily adapted for further related tasks.


Subjects
Cheminformatics, Drug Discovery, Humans, Drug Design, Molecular Structure, Pharmaceutical Chemistry
17.
J Chem Inf Model ; 64(8): 2948-2954, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38488634

ABSTRACT

SMARTS is a widely used language in cheminformatics for defining substructural queries for database lookups, reaction templates for chemical transformations, and other applications. Because SMARTS is an extension of SMILES, many different SMARTS patterns can represent the same query. Despite this, no canonicalization algorithm invariant to the line notation sequence or atomic numbering is publicly available. Here, we introduce RDCanon, an open-source Python package that can be used to standardize SMARTS queries. RDCanon is designed to ensure that the sequence of atomic queries remains consistent for all graphs representing the same substructure query and to ensure a canonical sequence of primitives within each individual atom query; furthermore, the algorithm can be applied to canonicalize the order of reactants, agents, and products and their atom map numbers in reaction SMARTS templates. As part of its canonicalization algorithm, RDCanon provides a mechanism in which the canonicalized SMARTS is optimized for speed against specific molecular databases. Several case studies are provided to showcase improved efficiency in substructure matching and retrosynthetic analysis.
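The idea of a canonical sequence of primitives within one atom query can be illustrated with a toy Python sketch. It relies only on the fact that `;` in SMARTS is a low-precedence AND, so the parts it separates can be reordered without changing the query. This is not RDCanon's algorithm and does not use RDKit; the function name is hypothetical, the lexicographic ordering is an arbitrary choice, and atom maps and recursive SMARTS are deliberately rejected rather than handled.

```python
def normalize_atom_query(atom_smarts):
    """Sort the ';'-separated parts of a single bracketed SMARTS atom query."""
    if not (atom_smarts.startswith("[") and atom_smarts.endswith("]")):
        raise ValueError("expected a single bracketed atom query")
    inner = atom_smarts[1:-1]
    # Atom map numbers (":n") and recursive SMARTS ("$(...)") need real parsing.
    if ":" in inner or "$" in inner:
        raise ValueError("atom maps and recursive SMARTS are not supported")
    parts = inner.split(";")
    return "[" + ";".join(sorted(parts)) + "]"

# Three spellings of "aliphatic carbon, in a ring, with two hydrogens".
variants = ["[C;R;H2]", "[H2;C;R]", "[R;H2;C]"]
print({normalize_atom_query(v) for v in variants})
```

All three spellings collapse to one string, which is the property a canonical form needs; a production tool would also have to order atoms across the whole pattern, which this sketch does not attempt.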


Subjects
Algorithms, Software, Programming Languages, Cheminformatics/methods, Chemical Databases
18.
J Chem Inf Model ; 64(8): 3173-3179, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38554112

ABSTRACT

In this work, we propose a versatile binary data format for encoding molecules and reactions that aims to bridge the gap between the advantages of SMILES, like local stereo- and implicit hydrogen encoding, and those of the block-structured MDL MOL format, with its 2D layout and explicit bond encoding, while addressing their respective limitations. Our new format strikes a balance between size efficiency, processing speed, and comprehensive representation, making it well-suited for various applications in cheminformatics, including deep learning, data storage, and searching. By offering an explicit approach to store atom connectivity (including implicit hydrogens), electronic state, stereochemistry, and other crucial molecular attributes, our proposal seeks to enhance data storage efficiency and promote interoperability among different software tools.


Subjects
Cheminformatics, Software, Cheminformatics/methods, Molecular Structure
19.
Adv Protein Chem Struct Biol ; 139: 27-55, 2024.
Article in English | MEDLINE | ID: mdl-38448138

ABSTRACT

The integration of computational resources and chemoinformatics has revolutionized translational health research. It has offered a powerful set of tools for accelerating drug discovery. This chapter overviews the computational resources and chemoinformatics methods used in translational health research. The resources and methods can be used to analyze large datasets, identify potential drug candidates, predict drug-target interactions, and optimize treatment regimens. These resources have the potential to transform the drug discovery process and foster personalized medicine research. We discuss insights into their various applications in translational health and emphasize the need for addressing challenges, promoting collaboration, and advancing the field to fully realize the potential of these tools in transforming healthcare.


Subjects
Cheminformatics, Drug Discovery, Precision Medicine
20.
J Org Chem ; 89(7): 4932-4946, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38451837

ABSTRACT

The concise synthesis of a small library of fluorinated piperidines from readily available dihydropyridinone derivatives is described. The effect of fluorination at different positions has then been evaluated with chemoinformatic tools. In particular, the compounds' pKa values have been calculated, revealing that the fluorine atoms notably lowered their basicity, a property that is correlated with affinity for hERG channels, which results in cardiac toxicity. The "lead-likeness" and three-dimensionality have also been evaluated to assess their suitability as fragments for drug design. A random screening on a panel of representative proteolytic enzymes was then carried out and revealed that one scaffold is recognized by the catalytic pocket of 3CLpro (main protease of the SARS-CoV-2 coronavirus).
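The practical meaning of a lower pKa can be shown with the Henderson-Hasselbalch relation, which gives the fraction of a base that is protonated at a given pH. The Python sketch below applies it at pH 7.4; the two pKa values are hypothetical placeholders for a parent piperidine and a fluorinated analogue, not values reported in the paper, and the function name is an assumption.

```python
def protonated_fraction(pka, ph=7.4):
    """Fraction of a monoprotic base present as its conjugate acid at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Hypothetical pKa values, chosen only to show the direction of the effect.
examples = [("parent piperidine", 10.5), ("fluorinated analogue", 8.0)]

for label, pka in examples:
    print(label, round(protonated_fraction(pka), 3))
```

A lower pKa leaves a smaller share of the amine protonated at physiological pH, which is the sense in which reduced basicity is discussed in connection with hERG affinity.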


Subjects
Cheminformatics, Drug Discovery, SARS-CoV-2, Drug Design, Protease Inhibitors/pharmacology, Antiviral Agents/pharmacology