Results 1 - 20 of 115
1.
J Chem Inf Model ; 64(10): 4286-4297, 2024 May 27.
Article in English | MEDLINE | ID: mdl-38708520

ABSTRACT

C-H borylation is a high-value transformation in the synthesis of lead candidates for the pharmaceutical industry because a wide array of downstream coupling reactions is available. However, predicting its regioselectivity, especially in drug-like molecules that may contain multiple heterocycles, is not a trivial task. Using a data set of borylation reactions from Reaxys, we explored how a language model originally trained on USPTO_500_MT, a broad-scope set of patent data, can be used to predict the C-H borylation reaction product in different modes: product generation and site reactivity classification. Our fine-tuned T5Chem multitask language model can generate the correct product in 79% of cases. It can also classify the reactive aromatic C-H bonds with 95% accuracy and 88% positive predictive value, exceeding purpose-developed graph-based neural networks.
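The two figures quoted for site classification, accuracy and positive predictive value (PPV), measure different things: accuracy counts every correctly labelled C-H site, while PPV asks how many of the sites predicted as reactive really are. A minimal sketch of both metrics follows, assuming binary labels per aromatic C-H site; the function name and the toy labels are illustrative and are not taken from the paper.

```python
def accuracy_and_ppv(y_true, y_pred):
    """Return (accuracy, positive predictive value) for binary labels.

    y_true and y_pred hold one 0/1 entry per aromatic C-H site
    (1 = site is borylated). Names and encoding are assumptions.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # PPV is undefined when no site is predicted reactive.
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    return accuracy, ppv


# Invented labels for eight C-H sites, for illustration only.
observed = [1, 0, 0, 1, 0, 0, 0, 1]
predicted = [1, 0, 0, 1, 1, 0, 0, 0]
acc, ppv = accuracy_and_ppv(observed, predicted)
print(f"accuracy={acc:.2f}, PPV={ppv:.2f}")
```

Because most aromatic C-H sites in a molecule are unreactive, accuracy alone can look high for a model that rarely predicts a reactive site, which is presumably why PPV is reported alongside it.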


Subject(s)
Hydrogen; Hydrogen/chemistry; Models, Chemical; Neural Networks, Computer
2.
J Chem Inf Model ; 64(8): 3180-3191, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38533705

ABSTRACT

In the pursuit of improved compound identification and database search tasks, this study explores heteronuclear single quantum coherence (HSQC) spectra simulation and matching methodologies. HSQC spectra serve as unique molecular fingerprints, enabling a valuable balance of data collection time and information richness. We conducted a comprehensive evaluation of the following four HSQC simulation techniques: ACD/Labs (ACD), MestReNova (MNova), Gaussian NMR calculations (DFT), and a graph-based neural network (ML). For the latter two techniques, we developed a reconstruction logic to combine proton and carbon 1D spectra into HSQC spectra. The methodology involved the implementation of three peak-matching strategies (minimum-sum, Euclidean-distance, and Hungarian distance) combined with three padding strategies (zero-padding, peak-truncated, and nearest-neighbor double assignment). We found that coupling these strategies with a robust simulation technique facilitates the accurate identification of correct molecules from similar analogues (regio- and stereoisomers) and allows for fast and accurate large database searches. Furthermore, we demonstrated the efficacy of the best-performing methodology by rectifying the structures of a set of previously misidentified molecules. This research indicates that effective HSQC spectral simulation and matching methodologies significantly facilitate molecular structure elucidation. Furthermore, we offer a Google Colab notebook for researchers to use our methods on their own data (https://github.com/AstraZeneca/hsqc_structure_elucidation.git).
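To make the matching step concrete, here is a rough sketch of one combination named in the abstract: zero-padding followed by a minimum-cost one-to-one peak assignment, which is the quantity the Hungarian algorithm computes. This is not the authors' code. The brute-force search, the axis scale factors, and the example shifts are all assumptions chosen to keep the example dependency-free.

```python
from itertools import permutations
from math import hypot


def pad_with_zeros(peaks, length):
    """Zero-padding: extend a peak list with (0.0, 0.0) peaks up to `length`."""
    return list(peaks) + [(0.0, 0.0)] * (length - len(peaks))


def assignment_distance(experimental, simulated, h_scale=1.0, c_scale=0.1):
    """Minimum total distance over all one-to-one peak pairings.

    Each peak is (1H shift in ppm, 13C shift in ppm). The scale factors,
    which put the two axes on a comparable footing, are arbitrary
    assumptions. Brute force is only practical for short peak lists; the
    Hungarian algorithm finds the same optimum in polynomial time.
    """
    n = max(len(experimental), len(simulated))
    exp_peaks = pad_with_zeros(experimental, n)
    sim_peaks = pad_with_zeros(simulated, n)

    def dist(a, b):
        return hypot((a[0] - b[0]) * h_scale, (a[1] - b[1]) * c_scale)

    return min(
        sum(dist(e, sim_peaks[j]) for e, j in zip(exp_peaks, perm))
        for perm in permutations(range(n))
    )


# Invented shifts: one measured spectrum and two candidate simulations.
experimental = [(7.26, 128.0), (3.70, 55.0), (1.20, 14.0)]
candidate_a = [(7.30, 128.5), (3.65, 55.4), (1.25, 14.2)]
candidate_b = [(7.30, 128.5), (2.10, 30.0)]

score_a = assignment_distance(experimental, candidate_a)
score_b = assignment_distance(experimental, candidate_b)
print("best match:", "A" if score_a < score_b else "B")
```

Ranking candidate structures by this score is one way to run a database search: the lower the total distance, the better the simulated spectrum fits the measured one.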


Subject(s)
Computer Simulation; Neural Networks, Computer
3.
Chemistry ; 30(28): e202303872, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38477400

ABSTRACT

Owing to its high natural abundance compared to the commonly used transition (precious) metals, as well as its high Lewis acidity and ability to change oxidation state, aluminium has recently been explored as the basis for a range of single-site catalysts. This paper aims to establish the ground rules for the development of a new type of cationic alkene oligomerisation catalyst containing two Al(III) ions, with the potential to act co-operatively in stereoselective assembly. Five new dimers of the type [R2Al(2-py')]2 (R=Me, iBu; py'=substituted pyridyl group) with different substituents on the Al atoms and pyridyl rings have been synthesised. The formation of the undesired cis isomers can be suppressed by the presence of substituents on the 6-position of the pyridyl ring due to steric congestion, with DFT calculations showing that the selection of the trans isomer is thermodynamically controlled. Calculations show that demethylation of the dimers [Me2Al(2-py')]2 with Ph3C+ to the cations [{MeAl(2-py')}2(µ-Me)]+ is highly favourable and that the desired trans disposition of the 2-pyridyl ring units is influenced by steric effects. Preliminary experimental studies confirm that demethylation of [Me2Al(6-MeO-2-py)]2 can be achieved using [Ph3C][B(C6F5)4].

4.
Chem Sci ; 14(43): 12355-12365, 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37969604

ABSTRACT

The selectivity in a group of oxazaborolidinium ion-catalysed reactions between aldehyde and diazo compounds cannot be explained using transition state theory. VRAI-selectivity, developed to predict the outcome of dynamically controlled reactions, can account for both the chemo- and the stereo-selectivity in these reactions, which are controlled by reaction dynamics. Subtle modifications to the substrate or catalyst substituents alter the potential energy surface, leading to changes in predominant reaction pathways and altering the barriers to the major product when reaction dynamics are considered. In addition, this study suggests an explanation for the mysterious inversion of enantioselectivity resulting from the inclusion of an ortho-iPrO group in the catalyst.

5.
Environ Sci Technol ; 57(46): 18259-18270, 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-37914529

ABSTRACT

Machine Learning (ML) is increasingly applied to fill data gaps in assessments to quantify impacts associated with chemical emissions and chemicals in products. However, the systematic application of ML-based approaches to fill chemical data gaps is still limited, and their potential for addressing a wide range of chemicals is unknown. We prioritized chemical-related parameters for chemical toxicity characterization to inform ML model development based on two criteria: (1) each parameter's relevance to robustly characterize chemical toxicity described by the uncertainty in characterization results attributable to each parameter and (2) the potential for ML-based approaches to predict parameter values for a wide range of chemicals described by the availability of chemicals with measured parameter data. We prioritized 13 out of 38 parameters for developing ML-based approaches, while flagging another nine with critical data gaps. For all prioritized parameters, we performed a chemical space analysis to assess further the potential for ML-based approaches to predict data for diverse chemicals considering the structural diversity of available measured data, showing that ML-based approaches can potentially predict 8-46% of marketed chemicals based on 1-10% with available measured data. Our results can systematically inform future ML model development efforts to address data gaps in chemical toxicity characterization.


Subject(s)
Machine Learning; Humans; Risk Assessment
6.
J Chem Inf Model ; 63(14): 4364-4375, 2023 Jul 24.
Article in English | MEDLINE | ID: mdl-37428183

ABSTRACT

CONFPASS (Conformer Prioritizations and Analysis for DFT re-optimizations) has been developed to extract dihedral angle descriptors from conformational searching outputs, perform clustering, and return a priority list for density functional theory (DFT) re-optimizations. Evaluations were conducted with DFT data of the conformers for 150 structurally diverse molecules, most of which are flexible. CONFPASS gives a confidence estimate that the global minimum structure has been found, and based on our dataset, we can have 90% confidence after optimizing half of the force-field (FF) structures. Re-optimizing conformers in order of the FF energy often generates duplicate results; using CONFPASS, the duplication rate is reduced by a factor of 2 for the first 30% of the re-optimizations, which include the global minimum structure about 80% of the time.


Subject(s)
Molecular Conformation; Thermodynamics
7.
Angew Chem Int Ed Engl ; 62(26): e202304756, 2023 Jun 26.
Article in English | MEDLINE | ID: mdl-37118885

ABSTRACT

The epigenetic modification 5-methylcytosine plays a vital role in development, cell-specific gene expression and disease states. The selective chemical modification of the 5-methylcytosine methyl group is challenging. Currently, no such chemistry exists. Direct functionalisation of 5-methylcytosine would improve the detection and study of this epigenetic feature. We report a xanthone-photosensitised process that introduces a 4-pyridine modification at a C(sp3)-H bond in the methyl group of 5-methylcytosine. We propose a reaction mechanism for this type of reaction based on density functional calculations and apply transition state analysis to rationalise differences in observed reaction efficiencies between cyanopyridine derivatives. The reaction is initiated by single electron oxidation of 5-methylcytosine followed by deprotonation to generate the methyl group radical. Cross coupling of the methyl radical with 4-cyanopyridine installs a 4-pyridine label at 5-methylcytosine. We demonstrate use of the pyridination reaction to enrich 5-methylcytosine-containing ribonucleic acid.


Subject(s)
5-Methylcytosine; Electrons; 5-Methylcytosine/chemistry; Oxidation-Reduction; Catalysis; Epigenesis, Genetic
8.
J Cheminform ; 15(1): 36, 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36945031

ABSTRACT

Vibrational circular dichroism (VCD) spectroscopy can generate the data required for the assignment of absolute configuration, but the spectra are hard to interpret. We have recorded VCD data for thirty pairs of small organic compounds and we use this database to validate a method for the automated analysis of VCD spectra and the assignment of absolute configuration: the Cai•factor (Configuration: absolute information). The analysis of the data demonstrates that the procedure is a reliable and time-efficient method for determination of absolute configuration, which gives both the assignment and a measure of confidence in the outcome, even when the spectra are imperfect. The majority of molecules tested have a high confidence score and all of these have the correct assignment.

9.
J Phys Chem A ; 127(11): 2628-2636, 2023 Mar 23.
Article in English | MEDLINE | ID: mdl-36916916

ABSTRACT

Computational reaction prediction has become a ubiquitous task in chemistry due to the potential value accurate predictions can bring to chemists. Boronic acids are widely used in industry; however, understanding how to avoid the protodeboronation side reaction remains a challenge. We have developed an algorithm for in silico prediction of the rate of protodeboronation of boronic acids. A general mechanistic model devised through kinetic studies of protodeboronation was found in the literature and forms the foundation on which the algorithm presented in this work is built. Protodeboronation proceeds through 7 distinct pathways, though for any particular boronic acid, only a subset of mechanistic pathways are active. The rate of each active mechanistic pathway is linearly correlated with its characteristic energy difference, which in turn can be determined using Density Functional Theory. We validated the algorithm using leave-one-out cross-validation on a data set of 50 boronic acids and made a further 50 rate predictions on academically and industrially important boronic acids out of sample. We believe this work will provide great assistance to chemists performing reactions that feature boronic acids, such as Suzuki-Miyaura and Chan-Evans-Lam couplings.
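The validation scheme described here, a linear rate-energy relationship checked by leave-one-out cross-validation, can be sketched in a few lines. This is an illustration, not the published algorithm: it treats a single pathway, assumes the logarithm of the rate is the fitted quantity (the abstract does not say which form of the rate is used), and uses invented numbers throughout.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_mae(xs, ys):
    """Mean absolute error when each point is predicted from all the others."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(slope * xs[i] + intercept - ys[i]))
    return sum(errors) / len(errors)


# Hypothetical data: DFT energy differences and measured log rates.
delta_e = [10.0, 12.0, 15.0, 18.0, 20.0]
log_rate = [-1.0, -1.5, -2.2, -3.1, -3.4]

mae = leave_one_out_mae(delta_e, log_rate)
print(f"leave-one-out MAE: {mae:.3f} log units")
```

Holding out each boronic acid in turn gives an error estimate for compounds the fit has not seen, which is the situation the out-of-sample predictions are meant to address.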

10.
Angew Chem Weinheim Bergstr Ger ; 135(26): e202304756, 2023 Jun 26.
Article in English | MEDLINE | ID: mdl-38516645

ABSTRACT

The epigenetic modification 5-methylcytosine plays a vital role in development, cell-specific gene expression and disease states. The selective chemical modification of the 5-methylcytosine methyl group is challenging. Currently, no such chemistry exists. Direct functionalisation of 5-methylcytosine would improve the detection and study of this epigenetic feature. We report a xanthone-photosensitised process that introduces a 4-pyridine modification at a C(sp3)-H bond in the methyl group of 5-methylcytosine. We propose a reaction mechanism for this type of reaction based on density functional calculations and apply transition state analysis to rationalise differences in observed reaction efficiencies between cyanopyridine derivatives. The reaction is initiated by single electron oxidation of 5-methylcytosine followed by deprotonation to generate the methyl group radical. Cross coupling of the methyl radical with 4-cyanopyridine installs a 4-pyridine label at 5-methylcytosine. We demonstrate use of the pyridination reaction to enrich 5-methylcytosine-containing ribonucleic acid.

11.
Chem Sci ; 13(24): 7204-7214, 2022 Jun 22.
Article in English | MEDLINE | ID: mdl-35799803

ABSTRACT

The use of machine learning techniques in computational chemistry has gained significant momentum since large molecular databases are now readily available. Predictions of molecular properties using machine learning have advantages over traditional quantum mechanics calculations because they can be computationally cheaper without losing accuracy. We present a new extrapolatable and explainable molecular representation based on bonds, angles and dihedrals that can be used to train machine learning models. The trained models can accurately predict the electronic energy and the free energy of small organic molecules with atom types C, H, N and O, with a mean absolute error of 1.2 kcal mol⁻¹. The models can be extrapolated to larger organic molecules with an average error of less than 3.7 kcal mol⁻¹ for 10 or fewer heavy atoms, which represent a chemical space two orders of magnitude larger. Rapid energy predictions for multiple molecules, up to 7 times faster than previous ML models of similar accuracy, have been achieved by sampling geometries around the potential energy surface minima. Therefore, the input geometries do not have to be located precisely on the minima and we show that accurate density functional theory energy predictions can be made from force-field optimised geometries with a mean absolute error of 2.5 kcal mol⁻¹.

12.
Chem Sci ; 13(12): 3507-3518, 2022 Mar 24.
Article in English | MEDLINE | ID: mdl-35432857

ABSTRACT

Whenever a new molecule is made, a chemist will justify the proposed structure by analysing the NMR spectra. The widely used DP4 algorithm will choose the best match from a series of possibilities, but draws no conclusions from a single candidate structure. Here we present the DP5 probability, a step-change in the quantification of molecular uncertainty: given one structure and one 13C NMR spectrum, DP5 gives the probability of the structure being correct. We show the DP5 probability can rapidly differentiate between structure proposals indistinguishable by NMR to an expert chemist. We also show in a number of challenging examples the DP5 probability may prevent incorrect structures being published and later reassigned. DP5 will prove extremely valuable in fields such as discovery-driven automated chemical synthesis and drug development. Alongside the DP4-AI package, DP5 can help guide synthetic chemists when resolving the most subtle structural uncertainty. The DP5 system is available at https://github.com/Goodman-lab/DP5.

13.
Org Biomol Chem ; 19(44): 9565-9618, 2021 Nov 18.
Article in English | MEDLINE | ID: mdl-34723293

ABSTRACT

N-Triflylphosphoramides (NTPA) have become increasingly popular catalysts in the development of enantioselective transformations as they are stronger Brønsted acids than the corresponding phosphoric acids (PA). Their highly acidic, asymmetric active site can activate difficult, unreactive substrates. In this review, we present an account of asymmetric transformations using this type of catalyst that have been reported in the past ten years and we classify these reactions using the enantio-determining step as the key criterion. This compendium of NTPA-catalysed reactions is organised into the following categories: (1) cycloadditions, (2) electrocyclisations, polyene and related cyclisations, (3) addition reactions to imines, (4) electrophilic aromatic substitutions, (5) addition reactions to carbocations, (6) aldol and related reactions, (7) addition reactions to double bonds, and (8) rearrangements and desymmetrisations. We highlight the use of NTPA in total synthesis and suggest mnemonics which account for their enantioselectivity.

16.
Org Biomol Chem ; 19(17): 3940-3947, 2021 May 05.
Article in English | MEDLINE | ID: mdl-33949564

ABSTRACT

In recent years, a growing number of organic reactions in the literature have shown selectivity controlled by reaction dynamics rather than by transition state theory. Such reactions are difficult to analyse because the transition state theory approach often does not capture the subtlety of the energy landscapes the compounds traverse and, therefore, cannot accurately predict the selectivity. We present an algorithm that can predict the major product and selectivity for a wide range of potential energy surfaces where the product distribution is influenced by reaction dynamics. The method requires as input calculation of the transition states, the intermediate (if present) and the product geometries. The algorithm is quick and simple to run and, except for two reactions with long alkyl chains, calculates selectivity more accurately than transition state theory alone.

17.
J Cheminform ; 13(1): 40, 2021 May 24.
Article in English | MEDLINE | ID: mdl-34030732

ABSTRACT

The software for the IUPAC Chemical Identifier, InChI, is extraordinarily reliable. It has been tested on large databases around the world, and has proved itself to be an essential tool in the handling and integration of large chemical databases. InChI version 1.05 was released in January 2017 and version 1.06 in December 2020. In this paper, we report on the current state of the InChI Software, the details of the improvements in the v1.06 release, and the results of a test of the InChI run on PubChem, a database of more than a hundred million molecules. The upgrade introduces significant new features, including support for pseudo-element atoms and an improved description of polymers. We expect that few, if any, applications using the standard InChI will need to change as a result of the changes in version 1.06. Numerical instability was discovered for 0.002% of this database, and a small number of other molecules were discovered for which the algorithm did not run smoothly. On the basis of PubChem data, we can demonstrate that InChI version 1.05 was 99.996% accurate, and InChI version 1.06 represents a step closer to perfection. Finally, we look forward to future releases and extensions for the InChI Chemical identifier.

18.
Environ Health Perspect ; 129(4): 47013, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33929906

ABSTRACT

BACKGROUND: Humans are exposed to tens of thousands of chemical substances that need to be assessed for their potential toxicity. Acute systemic toxicity testing serves as the basis for regulatory hazard classification, labeling, and risk management. However, it is cost- and time-prohibitive to evaluate all new and existing chemicals using traditional rodent acute toxicity tests. In silico models built using existing data facilitate rapid acute toxicity predictions without using animals. OBJECTIVES: The U.S. Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) Acute Toxicity Workgroup organized an international collaboration to develop in silico models for predicting acute oral toxicity based on five different end points: Lethal Dose 50 (LD50) value, U.S. Environmental Protection Agency hazard (four) categories, Globally Harmonized System for Classification and Labeling hazard (five) categories, very toxic chemicals (LD50 ≤ 50 mg/kg), and nontoxic chemicals (LD50 > 2,000 mg/kg). METHODS: An acute oral toxicity data inventory for 11,992 chemicals was compiled, split into training and evaluation sets, and made available to 35 participating international research groups that submitted a total of 139 predictive models. Predictions that fell within the applicability domains of the submitted models were evaluated using external validation sets. These were then combined into consensus models to leverage strengths of individual approaches. RESULTS: The resulting consensus predictions, which leverage the collective strengths of each individual model, form the Collaborative Acute Toxicity Modeling Suite (CATMoS). CATMoS demonstrated high performance in terms of accuracy and robustness when compared with in vivo results. DISCUSSION: CATMoS is being evaluated by regulatory agencies for its utility and applicability as a potential replacement for in vivo rat acute oral toxicity studies. CATMoS predictions for more than 800,000 chemicals have been made available via the National Toxicology Program's Integrated Chemical Environment tools and data sets (ice.ntp.niehs.nih.gov). The models are also implemented in a free, standalone, open-source tool, OPERA, which allows predictions of new and untested chemicals to be made. https://doi.org/10.1289/EHP8495.
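Two of the five end points are simple binary labels derived from an LD50 value, using the thresholds quoted in the abstract: very toxic at or below 50 mg/kg, and nontoxic above 2,000 mg/kg. The short sketch below shows that mapping. The function name and the dictionary layout are illustrative assumptions, and this is not CATMoS or OPERA code.

```python
def binary_endpoints(ld50_mg_per_kg):
    """Map an acute oral LD50 (mg/kg) onto the two binary end points.

    Thresholds are those stated in the abstract; everything else about
    this helper (name, return format) is a hypothetical convenience.
    """
    if ld50_mg_per_kg < 0:
        raise ValueError("LD50 cannot be negative")
    return {
        "very_toxic": ld50_mg_per_kg <= 50,
        "nontoxic": ld50_mg_per_kg > 2000,
    }


for ld50 in (12.0, 50.0, 300.0, 2000.0, 5000.0):
    print(ld50, binary_endpoints(ld50))
```

A chemical with an LD50 between the two thresholds is flagged by neither label, so the two end points are complementary screens rather than a single classification.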


Subject(s)
Government Agencies; Animals; Computer Simulation; Rats; Toxicity Tests, Acute; United States; United States Environmental Protection Agency
19.
Chem Res Toxicol ; 34(2): 217-239, 2021 Feb 15.
Article in English | MEDLINE | ID: mdl-33356168

ABSTRACT

In recent times, machine learning has become increasingly prominent in predictive toxicology as it has shifted from in vivo studies toward in silico studies. Currently, in vitro methods together with other computational methods such as quantitative structure-activity relationship modeling and absorption, distribution, metabolism, and excretion calculations are being used. An overview of machine learning and its applications in predictive toxicology is presented here, including support vector machines (SVMs), random forest (RF) and decision trees (DTs), neural networks, regression models, naïve Bayes, k-nearest neighbors, and ensemble learning. The recent successes of these machine learning methods in predictive toxicology are summarized, and a comparison of some models used in predictive toxicology is presented. In predictive toxicology, SVMs, RF, and DTs are the dominant machine learning methods due to the characteristics of the data available. Lastly, this review describes the current challenges facing the use of machine learning in predictive toxicology and offers insights into the possible areas of improvement in the field.


Subject(s)
Machine Learning; Toxicity Tests; Humans; Models, Molecular; Quantitative Structure-Activity Relationship
20.
Chem Res Toxicol ; 33(12): 3010-3022, 2020 Dec 21.
Article in English | MEDLINE | ID: mdl-33295767

ABSTRACT

Having a measure of confidence in computational predictions of biological activity from in silico tools is vital when making predictions for new chemicals, for example, in chemical risk assessment. Where predictions of biological activity are used as an indicator of a potential hazard, false-negative predictions are the most concerning prediction; however, assigning confidence in inactive predictions is particularly challenging. How can one confidently identify the absence of activating features? In this study, we present methods for assigning confidence to both active and inactive predictions from structural alerts for protein-binding molecular initiating events (MIEs). Structural alerts were derived through an iterative statistical method. Confidence in the activity predictions is assigned by measuring the Tanimoto similarity between Morgan fingerprints of chemicals in the test set to relevant chemicals in the training set, and suitable cutoff values have been defined to give different confidence categories. To avoid a potential compound series bias in the test set and hence overestimate the performance of the method, we measured the biological activity of 27 compounds with 24 proteins, which gave us an additional 648 experimental measurements; many of the measurements are currently nonexistent in the literature and databases. This data set was complemented with newly measured biological activities published in ChEMBL25 and formed a combined independent validation data set. Applying the confidence categories to the computational predictions for the new data leads to the identification of chemicals for which one should be confident of either an inactive or active prediction, allowing model predictions to be used responsibly.
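The similarity-based confidence assignment can be illustrated with a small, dependency-free sketch. Fingerprints are represented here as sets of "on" bit indices, standing in for Morgan fingerprints that would normally come from a cheminformatics toolkit. The cutoff values and category names are placeholders, since the abstract does not give the cutoffs that were actually defined.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def confidence_category(query_bits, training_fingerprints, high=0.7, medium=0.4):
    """Assign a confidence label from the nearest training-set neighbour.

    The `high` and `medium` cutoffs are hypothetical, not the published values.
    """
    best = max(
        (tanimoto(query_bits, fp) for fp in training_fingerprints),
        default=0.0,
    )
    if best >= high:
        return "high"
    if best >= medium:
        return "medium"
    return "low"


# Invented bit sets for one test chemical and two training chemicals.
query = {1, 2, 3}
training = [{2, 3, 4}, {1, 2, 3, 5}]
print(tanimoto(query, training[0]))
print(confidence_category(query, training))
```

The idea is that a prediction, active or inactive, deserves more trust when the query chemical closely resembles chemicals the alerts were derived from.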


Subject(s)
Organic Chemicals/chemistry; Proteins/chemistry; Databases, Factual; Molecular Structure