Results 1 - 20 of 35
1.
J Chem Inf Model ; 64(1): 9-17, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38147829

ABSTRACT

Deep learning has become a powerful and frequently employed tool for the prediction of molecular properties, thus creating a need for open-source and versatile software solutions that can be operated by nonexperts. Among the current approaches, directed message-passing neural networks (D-MPNNs) have proven to perform well on a variety of property prediction tasks. The software package Chemprop implements the D-MPNN architecture and offers simple, easy, and fast access to machine-learned molecular properties. Compared to its initial version, we present a multitude of new Chemprop functionalities such as the support of multimolecule properties, reactions, atom/bond-level properties, and spectra. Further, we incorporate various uncertainty quantification and calibration methods along with related metrics as well as pretraining and transfer learning workflows, improved hyperparameter optimization, and other customization options concerning loss functions or atom/bond features. We benchmark D-MPNN models trained using Chemprop with the new reaction, atom-level, and spectra functionality on a variety of property prediction data sets, including MoleculeNet and SAMPL, and observe state-of-the-art performance on the prediction of water-octanol partition coefficients, reaction barrier heights, atomic partial charges, and absorption spectra. Chemprop enables out-of-the-box training of D-MPNN models for a variety of problem settings in fast, user-friendly, and open-source software.


Subjects
Machine Learning , Software , Neural Networks, Computer , Chemical Phenomena , Water
2.
J Chem Inf Model ; 63(13): 4012-4029, 2023 Jul 10.
Article in English | MEDLINE | ID: mdl-37338239

ABSTRACT

Characterizing uncertainty in machine learning models has recently gained interest in the context of machine learning reliability, robustness, safety, and active learning. Here, we separate the total uncertainty into contributions from noise in the data (aleatoric) and shortcomings of the model (epistemic), further dividing epistemic uncertainty into model bias and variance contributions. We systematically address the influence of noise, model bias, and model variance in the context of chemical property predictions, where the diverse nature of target properties and the vast chemical space give rise to many distinct sources of prediction error. We demonstrate that different sources of error can each be significant in different contexts and must be individually addressed during model development. Through controlled experiments on data sets of molecular properties, we show important trends in model performance associated with the level of noise in the data set, size of the data set, model architecture, molecule representation, ensemble size, and data set splitting. In particular, we show that 1) noise in the test set can limit a model's observed performance when the actual performance is much better, 2) using size-extensive model aggregation structures is crucial for extensive property prediction, and 3) ensembling is a reliable tool for uncertainty quantification and improvement specifically for the contribution of model variance. We develop general guidelines on how to improve an underperforming model when falling into different uncertainty contexts.


Subjects
Machine Learning , Uncertainty , Reproducibility of Results
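The split of total uncertainty described in entry 2 can be illustrated with a small ensemble. The sketch below is only an illustration under stated assumptions: each ensemble member is assumed to return a predicted mean and a predicted noise variance, and the function name and toy numbers are hypothetical rather than taken from the paper. Following the law of total variance, it averages the predicted variances (aleatoric part) and takes the spread of the predicted means (the model-variance part of the epistemic uncertainty). Model bias is not captured by this decomposition.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split ensemble uncertainty for a single input.

    means:     predicted mean from each ensemble member
    variances: predicted noise variance from each ensemble member
    Returns (prediction, aleatoric, model_variance, total).
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one mean and one variance per ensemble member")
    prediction = fmean(means)
    aleatoric = fmean(variances)        # average predicted data noise
    model_variance = pvariance(means)   # disagreement between members
    return prediction, aleatoric, model_variance, aleatoric + model_variance


# Hypothetical outputs of a three-member ensemble for one molecule.
pred, alea, var, total = decompose_uncertainty([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

Adding members to such an ensemble shrinks the error of the averaged prediction that stems from model variance, but it leaves the aleatoric term and any model bias untouched, which is in line with the abstract's third finding.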
3.
J Chem Phys ; 158(20), 2023 May 28.
Article in English | MEDLINE | ID: mdl-37212411

ABSTRACT

A reliable uncertainty estimator is a key ingredient in the successful use of machine-learning force fields for predictive calculations. Important considerations are correlation with error, overhead during training and inference, and efficient workflows to systematically improve the force field. However, in the case of neural-network force fields, simple committees are often the only option considered due to their easy implementation. Here, we present a generalization of the deep-ensemble design based on multiheaded neural networks and a heteroscedastic loss. It can efficiently deal with uncertainties in both energy and forces and take sources of aleatoric uncertainty affecting the training data into account. We compare uncertainty metrics based on deep ensembles, committees, and bootstrap-aggregation ensembles using data for an ionic liquid and a perovskite surface. We demonstrate an adversarial approach to active learning to efficiently and progressively refine the force fields. That active learning workflow is realistically possible thanks to exceptionally fast training enabled by residual learning and a nonlinear learned optimizer.
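The abstract of entry 3 does not spell out the heteroscedastic loss. A common choice in deep ensembles is the Gaussian negative log-likelihood, in which the network predicts both a mean and a variance for every target. The sketch below assumes that form; it is not necessarily the loss used by the authors, and the function name is hypothetical.

```python
import math


def gaussian_nll(target, mean, variance):
    """Gaussian negative log-likelihood of one target value.

    Assumed form: 0.5 * (log(2*pi*variance) + (target - mean)**2 / variance).
    """
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    return 0.5 * (math.log(2.0 * math.pi * variance) + (target - mean) ** 2 / variance)


# For a fixed error of 2.0, the loss is lowest when the predicted
# variance matches the squared error (4.0).
too_confident = gaussian_nll(2.0, 0.0, 1.0)
calibrated = gaussian_nll(2.0, 0.0, 4.0)
too_cautious = gaussian_nll(2.0, 0.0, 16.0)
```

Because the loss penalizes both overconfident and overly cautious variances, the predicted variance can act as an estimate of the noise in the training data.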

4.
J Am Chem Soc ; 144(49): 22599-22610, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36459170

ABSTRACT

The molecular structures synthesizable by organic chemists dictate the molecular functions they can create. The invention and development of chemical reactions are thus critical for chemists to access new and desirable functional molecules in all disciplines of organic chemistry. This work seeks to expedite the exploration of emerging areas of organic chemistry by devising a machine-learning-guided workflow for reaction discovery. Specifically, this study uses machine learning to predict competent electrochemical reactions. To this end, we first develop a molecular representation that enables the production of general models with limited training data. Next, we employ automated experimentation to test a large number of electrochemical reactions. These reactions are categorized as competent or incompetent mixtures, and a classification model is trained to predict reaction competency. This model is used to screen 38,865 potential reactions in silico, and the predictions are used to identify a number of reactions of synthetic or mechanistic interest, 80% of which are found to be competent. Additionally, we provide the predictions for the 38,865-member set in the hope of accelerating the development of this field. We envision that adopting a workflow such as this could enable the rapid development of many fields of chemistry.


Subjects
Chemistry, Organic , Machine Learning , Molecular Structure
5.
J Chem Inf Model ; 62(9): 2101-2110, 2022 May 09.
Article in English | MEDLINE | ID: mdl-34734699

ABSTRACT

The estimation of chemical reaction properties such as activation energies, rates, or yields is a central topic of computational chemistry. In contrast to molecular properties, where machine learning approaches such as graph convolutional neural networks (GCNNs) have excelled for a wide variety of tasks, no general and transferable adaptations of GCNNs for reactions have been developed yet. We therefore combined a popular cheminformatics reaction representation, the so-called condensed graph of reaction (CGR), with a recent GCNN architecture to arrive at a versatile, robust, and compact deep learning model. The CGR is a superposition of the reactant and product graphs of a chemical reaction and thus an ideal input for graph-based machine learning approaches. The model learns to create a data-driven, task-dependent reaction embedding that does not rely on expert knowledge, similar to current molecular GCNNs. Our approach outperforms current state-of-the-art models in accuracy, is applicable even to imbalanced reactions, and possesses excellent predictive capabilities for diverse target properties, such as activation energies, reaction enthalpies, rate constants, yields, or reaction classes. We furthermore curated a large set of atom-mapped reactions along with their target properties, which can serve as benchmark data sets for future work. All data sets and the developed reaction GCNN model are available online, free of charge, and open source.


Subjects
Machine Learning , Neural Networks, Computer , Cheminformatics
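The superposition idea behind the condensed graph of reaction in entry 5 can be sketched with plain dictionaries. The example assumes that reactant and product bonds are given as tables keyed by pairs of atom-map numbers. The function names and the toy substitution reaction are hypothetical and do not reproduce the authors' implementation.

```python
def condensed_graph(reactant_bonds, product_bonds):
    """Superimpose two bond tables keyed by atom-map number pairs.

    Each edge of the result carries (order_in_reactants, order_in_products);
    an order of 0 means the bond is absent on that side.
    """
    def normalise(bonds):
        # Store every pair as (low, high) so (2, 1) and (1, 2) coincide.
        return {tuple(sorted(pair)): order for pair, order in bonds.items()}

    reactants = normalise(reactant_bonds)
    products = normalise(product_bonds)
    pairs = sorted(set(reactants) | set(products))
    return {pair: (reactants.get(pair, 0), products.get(pair, 0)) for pair in pairs}


def changed_bonds(cgr):
    """Return only the edges whose bond order differs between the two sides."""
    return {pair: orders for pair, orders in cgr.items() if orders[0] != orders[1]}


# Toy substitution: the bond between atoms 1 and 2 breaks, the bond
# between atoms 1 and 3 forms, and the bond between 1 and 4 is unchanged.
cgr = condensed_graph({(2, 1): 1, (1, 4): 1}, {(1, 3): 1, (1, 4): 1})
reaction_centre = changed_bonds(cgr)
```

A graph network can then take the paired bond orders as edge features, so that a single graph encodes both sides of the reaction.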
6.
J Chem Inf Model ; 62(6): 1388-1398, 2022 Mar 28.
Article in English | MEDLINE | ID: mdl-35271260

ABSTRACT

Multiparameter optimization, the heart of drug design, is still an open challenge. Thus, improved methods for automated compound design with multiple controlled properties are desired. Here, we present a significant extension to our previously described fragment-based reinforcement learning method (DeepFMPO) for the generation of novel molecules with optimal properties. As before, the generative process outputs optimized molecules similar to the input structures, now with the improved feature of replacing parts of these molecules with fragments of similar three-dimensional (3D) shape and electrostatics. We developed and benchmarked a new python package, ESP-Sim, for the comparison of the electrostatic potential and the molecular shape, allowing the calculation of high-quality partial charges (e.g., RESP with B3LYP/6-31G**) obtained using the quantum chemistry program Psi4. By performing comparisons of 3D fragments, we can simulate 3D properties while overcoming the notoriously difficult step of accurately describing bioactive conformations. The new improved generative (DeepFMPO v3D) method is demonstrated with a scaffold-hopping exercise identifying CDK2 bioisosteres. The code is open-source and freely available.


Subjects
Drug Design , Static Electricity
7.
J Chem Inf Model ; 62(1): 16-26, 2022 Jan 10.
Article in English | MEDLINE | ID: mdl-34939786

ABSTRACT

Heuristic and machine learning models for rank-ordering reaction templates comprise an important basis for computer-aided organic synthesis regarding both product prediction and retrosynthetic pathway planning. Their viability relies heavily on the quality and characteristics of the underlying template database. With the advent of automated reaction and template extraction software and consequently the creation of template databases too large for manual curation, a data-driven approach to assess and improve the quality of template sets is needed. We therefore systematically studied the influence of template generality, canonicalization, and exclusivity on the performance of different template ranking models. We find that duplicate and nonexclusive templates, i.e., templates which describe the same chemical transformation on identical or overlapping sets of molecules, decrease both the accuracy of the ranking algorithm and the applicability of the respective top-ranked templates significantly. To remedy the negative effects of nonexclusivity, we developed a general and computationally efficient framework to deduplicate and hierarchically correct templates. As a result, performance improved considerably for both heuristic and machine learning template ranking models, as well as multistep retrosynthetic planning models. The canonicalization and correction code is made freely available.


Subjects
Algorithms , Software , Computers , Heuristics , Machine Learning
8.
Phys Chem Chem Phys ; 24(26): 15776-15790, 2022 Jul 06.
Article in English | MEDLINE | ID: mdl-35758401

ABSTRACT

We use polarizable molecular dynamics simulations to study the thermal dependence of both structural and dynamic properties of two ionic liquids sharing the same cation (1-ethyl-3-methylimidazolium). The linear temperature trend in the structure is accompanied by an exponential Arrhenius-like behavior of the dynamics. Our parameter-free Voronoi tessellation analysis directly casts doubt on common concepts such as the alternating shells of cations and anions and the ionicity. The latter tries to explain the physico-chemical properties of the ionic liquids based on the association and dissociation of an ion pair. However, cations are in the majority of both ion cages, around cations and around anions. There is no preference of a cation for a single anion. Collectivity is a key factor in the dynamic properties of ionic liquids. Consequently, collective rotation relaxes faster than single-particle rotations, and the activation energies for collective translation and rotation are lower than those of the single molecules.
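Arrhenius-like behavior, as mentioned in entry 8, means that a rate or transport coefficient k follows ln k = ln A - Ea/(R T), so an activation energy can be read from the slope of ln k against 1/T. The sketch below fits that line by ordinary least squares. The function name and the synthetic data are hypothetical and are not values from the paper.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def fit_arrhenius(temperatures, rates):
    """Least-squares fit of ln k = ln A - Ea / (R T).

    Returns (activation_energy in J/mol, prefactor A).
    """
    x = [1.0 / t for t in temperatures]
    y = [math.log(k) for k in rates]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return -slope * R, math.exp(intercept)


# Synthetic data generated with Ea = 25 kJ/mol and A = 2000 (arbitrary units).
temps = [300.0, 325.0, 350.0, 375.0, 400.0]
rates = [2.0e3 * math.exp(-25_000.0 / (R * t)) for t in temps]
activation_energy, prefactor = fit_arrhenius(temps, rates)
```

Applying the same fit to collective and to single-particle relaxation rates would allow the comparison of activation energies that the abstract reports.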

9.
Int J Cancer ; 148(9): 2345-2351, 2021 May 01.
Article in English | MEDLINE | ID: mdl-33231291

ABSTRACT

Kaposiform hemangioendothelioma (KHE) is a rare vascular tumor in children, which can be accompanied by life-threatening thrombocytopenia, referred to as Kasabach-Merritt phenomenon (KMP). The mTOR inhibitor sirolimus is emerging as targeted therapy in KHE. As the sirolimus effect on KHE occurs only after several weeks, we aimed to evaluate whether additional transarterial embolization is of benefit for children with KHE and KMP. Seventeen patients with KHE and KMP acquired from 11 hospitals in Germany were retrospectively divided into two cohorts: children treated with adjunct transarterial embolization and systemic sirolimus, and those treated with sirolimus without additional embolization. Bleeding grade as defined by WHO was determined for all patients. Response of the primary tumor at 6 and 12 months assessed by magnetic resonance imaging (MRI), time to response of KMP defined as thrombocyte increase >150 × 10³/µL, as well as rebound rates of both after cessation of sirolimus were compared. N = 8 patients had undergone embolization in addition to systemic sirolimus therapy; sirolimus in this group was started a mean of 6.5 ± 3 days after embolization. N = 9 patients were identified who had received sirolimus without additional embolization. Adjunct embolization induced a more rapid resolution of KMP, within a median of 7 days vs 3 months; however, tumor response as well as rebound rates were similar between both groups. Additive embolization may be of value for a more rapid rescue of consumptive coagulopathy in children with KHE and KMP compared to systemic sirolimus only.


Subjects
Embolization, Therapeutic/methods , Hemangioendothelioma/drug therapy , Kasabach-Merritt Syndrome/drug therapy , Sarcoma, Kaposi/drug therapy , Sirolimus/therapeutic use , Female , Humans , Male , Retrospective Studies , Sirolimus/pharmacology
10.
J Chem Inf Model ; 61(10): 4949-4961, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-34587449

ABSTRACT

Data-driven computer-aided synthesis planning utilizing organic or biocatalyzed reactions from large databases has gained increasing interest in the last decade, sparking the development of numerous tools to extract, apply, and score general reaction templates. The generation of reaction rules for enzymatic reactions is especially challenging since substrate promiscuity varies between enzymes, causing the optimal levels of rule specificity and optimal number of included atoms to differ between enzymes. This complicates an automated extraction from databases and has promoted the creation of manually curated reaction rule sets. Here, we present EHreact, a purely data-driven open-source software tool, to extract and score reaction rules from sets of reactions known to be catalyzed by an enzyme at appropriate levels of specificity without expert knowledge. EHreact extracts and groups reaction rules into tree-like structures, Hasse diagrams, based on common substructures in the imaginary transition structures. Each diagram can be utilized to output a single or a set of reaction rules, as well as calculate the probability of a new substrate to be processed by the given enzyme by inferring information about the reactive site of the enzyme from the known reactions and their grouping in the template tree. EHreact heuristically predicts the activity of a given enzyme on a new substrate, outperforming current approaches in accuracy and functionality.


Subjects
Computers , Software , Databases, Factual , Probability
11.
Phys Chem Chem Phys ; 23(2): 1616-1626, 2021 Jan 21.
Article in English | MEDLINE | ID: mdl-33410837

ABSTRACT

The Kamlet-Taft dipolarity/polarizability parameters π* for various ionic liquids were determined using 4-tert-butyl-2-((dicyanomethylene)-5-[4-N,N-diethylamino)-benzylidene]-Δ³-thiazoline and 5-(N,N-dimethylamino)-5'-nitro-2,2'-bithiophene as solvatochromic probes. In contrast to the established π*-probe N,N-diethylnitroaniline, the chromophores presented here show excellent agreement with polarity measurement using the chemical shift of ¹²⁹Xe. They do not suffer from additional bathochromic UV/vis shifts caused by hydrogen-bonding resulting in too high π*-values for some ionic liquids. In combination with large sets of various ionic liquids, these new chromophores thereby allow for detailed analysis of the physical significance of π* and the comparison to quantum-mechanical methods. We find that π* correlates strongly with the ratio of molar refractivity to molar volume, and thus with the refractive index.
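The last step of entry 11, from the ratio of molar refractivity to molar volume to the refractive index, follows from the Lorentz-Lorenz relation R_m/V_m = (n² - 1)/(n² + 2), which increases monotonically with n. The short sketch below evaluates that ratio. The function name is hypothetical and the refractive indices are illustrative values, not data from the paper.

```python
def refractivity_per_volume(refractive_index):
    """Lorentz-Lorenz ratio R_m / V_m = (n^2 - 1) / (n^2 + 2), dimensionless."""
    n_squared = refractive_index ** 2
    return (n_squared - 1.0) / (n_squared + 2.0)


# Illustrative refractive indices; the ratio rises with n.
ratios = [refractivity_per_volume(n) for n in (1.0, 1.4, 1.5, 1.6)]
```

Since the ratio depends on n alone, a correlation of π* with R_m/V_m implies a correlation with the refractive index.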

12.
J Chem Phys ; 155(7): 074504, 2021 Aug 21.
Article in English | MEDLINE | ID: mdl-34418918

ABSTRACT

Redox-active molecules are of interest in many fields, such as medicine, catalysis, or energy storage. In particular, in supercapacitor applications, they can be grafted to ionic liquids to form so-called biredox ionic liquids. To completely understand the structural and transport properties of such systems, insight at the molecular scale is often required, but few force fields are developed ad hoc for these molecules. Moreover, they do not include polarization effects, which can lead to inaccurate solvation and dynamical properties. In this work, we developed polarizable force fields for the redox-active species anthraquinone (AQ) and 2,2,6,6-tetramethylpiperidinyl-1-oxyl (TEMPO) in their oxidized and reduced states as well as for acetonitrile. We validate the structural properties of AQ, AQ•⁻, AQ²⁻, TEMPO•, and TEMPO⁺ in acetonitrile against density functional theory-based molecular dynamics simulations and we study the solvation of these redox molecules in acetonitrile. This work is a first step toward the characterization of the role played by AQ and TEMPO in electrochemical and catalytic devices.

13.
Phys Chem Chem Phys ; 22(33): 18388-18399, 2020 Sep 07.
Article in English | MEDLINE | ID: mdl-32797139

ABSTRACT

Different types of spectroscopy capture different aspects of dynamics and different ranges of intermolecular contributions. In this article, we investigate the dielectric relaxation spectroscopy (DRS) of collective nature and the time-dependent Stokes shift (TDSS) of disputed nature. Our computational study of unconfined and confined water clearly demonstrates that the TDSS reflects local, non-collective dynamics. Surprisingly, we found that the reaction field continuum model (RFCM) used to estimate TDSS curves solely from collective DRS spectra correctly transforms collective dynamics to local ones even in cases when the relaxation time trends are quite different. This correct transformation is possible due to structural information available in the DRS amplitude in a Kivelson-Madden-like context.

14.
J Chem Phys ; 152(9): 094105, 2020 Mar 07.
Article in English | MEDLINE | ID: mdl-33480729

ABSTRACT

Ionic liquids are an interesting class of soft matter with viscosities one or two orders of magnitude higher than that of water. Unfortunately, classical, non-polarizable molecular dynamics (MD) simulations of ionic liquids yield dynamics that are too slow, which demonstrates the need for explicit inclusion of polarizability. The inclusion of polarizability, here via the Drude oscillator model, requires amendments to the employed thermostat, where we consider a dual Nosé-Hoover thermostat, as well as a dual Langevin thermostat. We investigate the effects of the choice of a thermostat and the underlying parameters such as the masses and force constants of the Drude particles on static and dynamic properties of ionic liquids. Here, we show that Langevin thermostats are not suitable for investigating the dynamics of ionic liquids. Since polarizable MD simulations are associated with high computational costs, we employed a self-developed graphics processing unit enhanced code within the MD program CHARMM to keep the overall computational effort reasonable.
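In the Drude oscillator model named in entry 14, a charged auxiliary particle is tied to its parent atom by a harmonic spring, and the resulting atomic polarizability is α = q²/k for Drude charge q and force constant k. The sketch below checks that relation against the induced dipole in a static field. It uses arbitrary but consistent units, and the function names and numbers are hypothetical, not parameters from the paper.

```python
def drude_polarizability(charge, force_constant):
    """Polarizability alpha = q^2 / k of a single Drude oscillator."""
    if force_constant <= 0.0:
        raise ValueError("force constant must be positive")
    return charge ** 2 / force_constant


def induced_dipole(charge, force_constant, field):
    """Dipole of the displaced Drude particle in a static field.

    At equilibrium the spring force k*d balances the electric force q*E,
    so the displacement is d = q*E/k and the dipole is q*d.
    """
    displacement = charge * field / force_constant
    return charge * displacement


# Illustrative parameters in arbitrary consistent units.
alpha = drude_polarizability(-2.0, 1000.0)
dipole = induced_dipole(-2.0, 1000.0, 5.0)
```

The sign of the Drude charge drops out of α, which is why the force constant and charge can be traded off against each other at fixed polarizability, one of the parameter choices the abstract examines.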

15.
Phys Chem Chem Phys ; 21(8): 4435-4443, 2019 Feb 20.
Article in English | MEDLINE | ID: mdl-30729972

ABSTRACT

The time-dependent Stokes shift (TDSS) has attracted increasing interest for measuring hydration dynamics around biomolecules during the last decades. Its ability to report on hydration dynamics around proteins, however, was questioned recently since the experimental signal stems from both water and protein motion with an unknown ratio of contribution. Using large-scale computer simulations, we examine the ability of the TDSS to capture local hydration dynamics at nine different sites around the protein ubiquitin. By computationally constraining protein motion, it is shown that the remaining water component is meaningful and in line with the picture of a heterogeneous yet overall mobile hydration layer. However, protein contributions are excessively large and cannot be removed in an experimental context, thus obscuring the water component. Consequently, we conclude that the experimental TDSS may not be suitable for the investigation of hydration dynamics around proteins.

16.
Phys Chem Chem Phys ; 21(3): 1023-1028, 2019 Jan 21.
Article in English | MEDLINE | ID: mdl-30601488

ABSTRACT

The validity of linear response theory (LRT) in computer simulations of solvation dynamics, i.e. the time-dependent Stokes shift, has been debated widely during the last decades. Since the use of LRT is computationally less expensive than the calculation of the true nonequilibrium response, it is often invoked for large systems exhibiting a particularly slow solvation response, e.g. ionic liquids. In the case of ionic liquids, LRT must capture not only the correct overall dynamics of the system, but also the contributions and timescales of the respective cation and anion movement. We show by large-scale computer simulations that the contribution of the permanent dipoles to the solvation response obeys LRT to some extent, whereas the induced contributions in polarizable simulations lead to a failure of LRT for the respective ion contributions.
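The comparison at stake in entry 16 can be made concrete with the two standard quantities: the normalized nonequilibrium response S(t) = (ΔE(t) - ΔE(∞)) / (ΔE(0) - ΔE(∞)) and the normalized equilibrium time correlation function C(t) = ⟨δΔE(0) δΔE(t)⟩ / ⟨δΔE²⟩ of the energy gap ΔE. Linear response theory holds to the extent that the two agree. The sketch below computes both from plain lists. The function names and toy series are hypothetical, and the last sample of the nonequilibrium series is assumed to stand in for the long-time limit.

```python
from statistics import fmean


def stokes_shift_function(gap_after_excitation):
    """Normalised nonequilibrium response S(t).

    The first sample is taken as t = 0 and the last as t -> infinity.
    """
    start, end = gap_after_excitation[0], gap_after_excitation[-1]
    if start == end:
        raise ValueError("energy gap does not relax")
    return [(gap - end) / (start - end) for gap in gap_after_excitation]


def equilibrium_correlation(gap_series, max_lag):
    """Normalised autocorrelation C(t) of energy-gap fluctuations."""
    n = len(gap_series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = fmean(gap_series)
    delta = [gap - mean for gap in gap_series]

    def corr(lag):
        return sum(delta[i] * delta[i + lag] for i in range(n - lag)) / (n - lag)

    zero_lag = corr(0)
    return [corr(lag) / zero_lag for lag in range(max_lag + 1)]


# Toy data: a relaxing nonequilibrium gap and an alternating equilibrium series.
s_t = stokes_shift_function([10.0, 6.0, 4.0, 2.0])
c_t = equilibrium_correlation([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], max_lag=1)
```

In a simulation study, S(t) would be averaged over many excitation events and C(t) over a long equilibrium trajectory before the two curves are compared, separately for cation and anion contributions if needed.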

17.
Phys Chem Chem Phys ; 21(32): 17703-17710, 2019 Aug 28.
Article in English | MEDLINE | ID: mdl-31367711

ABSTRACT

The inclusion of explicit polarization in molecular dynamics simulation has gained increasing interest during the last several years. An understudied area is the role of polarizability in computer simulations of solvation dynamics around chromophores, particularly for the large solutes used in experimental studies. In this work, we present fully polarizable ground and excited state force fields for the common fluorophores N-methyl-6-oxyquinolinium betaine and Coumarin 153. While analyzing the solvation responses in water, methanol, and the highly viscous ionic liquid 1-ethyl-3-methylimidazolium trifluoromethanesulfonate, we found that the inclusion of solute polarizability considerably increases the agreement of the obtained Stokes shift relaxation functions with experimental data. Solute polarizability slows down the inertial solvation response in the femtosecond time regime and enables the chromophore to adapt its dipole moment to the environment. Furthermore, the developed chromophore force field correctly reproduces the solute dipole moments in both the electronic ground and excited state in environments ranging from gas phase to highly polar media. Based on these studies it is anticipated that polarizable models of chromophores will lead to an improved understanding of the relationship of their environment to their spectroscopic properties.

18.
J Chem Phys ; 150(17): 175102, 2019 May 07.
Article in English | MEDLINE | ID: mdl-31067863

ABSTRACT

The bioprotective nature of monosaccharides and disaccharides is often attributed to their ability to slow down the dynamics of adjacent water molecules. Indeed, solvation dynamics close to sugars is indisputably retarded compared to bulk water. However, further research is needed on the qualitative and quantitative differences between the water dynamics around different saccharides. Current studies on this topic disagree on whether the disaccharide trehalose retards water to a larger extent than other isomers. Based on molecular dynamics simulation of the time-dependent Stokes shift of a chromophore close to the saccharides trehalose, sucrose, maltose, and glucose, this study reports a slightly stronger retardation of trehalose compared to other sugars at room temperature and below. Calculation and analysis of the intermolecular nuclear Overhauser effect, nuclear quadrupole relaxation, dielectric relaxation spectroscopy, and first shell residence times at room temperature yield further insights into the hydration dynamics of different sugars and confirm that trehalose slows down water dynamics to a slightly larger extent than other sugars. Since the calculated observables span a wide range of timescales relevant to intermolecular nuclear motion, and correspond to different kinds of motions, this study allows for a comprehensive view on sugar hydration dynamics.

19.
Phys Chem Chem Phys ; 20(7): 5246-5255, 2018 Feb 14.
Article in English | MEDLINE | ID: mdl-29400383

ABSTRACT

This study presents large-scale computer simulations of two common fluorophores, N-methyl-6-oxyquinolinium betaine and coumarin 153, in five polar or ionic solvents. The validity of linear response approximations to calculate the time-dependent Stokes shift is evaluated in each system. In most of the systems studied, linear response theory fails. In ionic liquids the magnitude of the overall response is largely overestimated, and linear response theory is not able to capture the individual contributions of cations and anions. In polar liquids, the timescales of solvation dynamics are often not correctly reproduced. These observations are complemented by a detailed analysis of Gaussian statistics including higher order correlation functions, variance of the energy gap distribution and its time evolution. The analysis of higher order correlation functions was found to be unsuitable for predicting a failure of linear response theory. Further analysis of radial distribution functions and hydrogen bonds in the ground and excited state, as well as of the time evolution of the number of hydrogen bonds after solute excitation, reveals an influence of solvent structure in some of the studied systems.

20.
Phys Chem Chem Phys ; 20(13): 8554-8563, 2018 Mar 28.
Article in English | MEDLINE | ID: mdl-29542743

ABSTRACT

Ground and excited state dipoles and polarizabilities of the chromophores N-methyl-6-oxyquinolinium betaine (MQ) and coumarin 153 (C153) in solution have been evaluated using time-dependent density functional theory (TD-DFT). A method for determining the atomic polarizabilities has been developed; the molecular dipole has been decomposed into atomic charge transfer and polarizability terms, and variation in the presence of an electric field has been used to evaluate atomic polarizabilities. On excitation, MQ undergoes very site-specific changes in polarizability while C153 shows significantly less variation. We also conclude that MQ cannot be adequately described by standard atomic polarizabilities based on atomic number and hybridization state. Changes in the molecular polarizability of MQ (on excitation) are not representative of the local site-specific changes in atomic polarizability, thus the overall molecular polarizability ratio does not provide a good approximation for local atom-specific polarizability changes on excitation. Accurate excited state force fields are needed for computer simulation of solvation dynamics. The chromophores considered in this study are often used as molecular probes. The methods and data reported here can be used for the construction of polarizable ground and excited state force fields. Atomic and molecular polarizabilities (ground and excited states) have been evaluated over a range of functionals and basis sets. Different mechanisms for including solvation effects have been examined: a polarizable continuum model, explicit solvation, and sampling of clusters extracted from an MD simulation. A range of different solvents has also been considered.
