1.
Phys Chem Chem Phys ; 26(4): 3540-3547, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38214052

ABSTRACT

Classical molecular dynamics (MD) simulations without bond forming/breaking cannot be used to model chemical reactions (CRs) among small molecules. Although first-principles MD simulations can adequately describe CRs with explicit water molecules, such simulations are normally too costly for most researchers to afford. Generally, water molecules in a solvent can exert hydrophobic forces on reacting molecules, which yields a so-called caging effect that cannot be ignored when constructing a free energy landscape for reacting molecules. Many recently developed semi-empirical methods (such as DFTB, PM6 and xTB) are highly efficient for modeling CRs; however, none of them can be directly used to model bulk water properly. Here, we developed a modified xTB approach that enables the simulation of CRs in explicit water. Using the chemisorption of CO2 by amines in water as an example application, we demonstrate that our approach yielded results comparable with first-principles ones while using only limited computing resources. Potentially, our proposed semi-empirical water model can be utilized for the computational study of any CR in water.

2.
J Chem Inf Model ; 63(4): 1099-1113, 2023 02 27.
Article in English | MEDLINE | ID: mdl-36758178

ABSTRACT

Accurate methods to predict solubility from molecular structure are highly sought after in the chemical sciences. To assess the state of the art, the American Chemical Society organized a "Second Solubility Challenge" in 2019, in which competitors were invited to submit blinded predictions of the solubilities of 132 drug-like molecules. In the first part of this article, we describe the development of two models that were submitted to the Blind Challenge in 2019 but which have not previously been reported. These models were based on computationally inexpensive molecular descriptors and traditional machine learning algorithms and were trained on a relatively small data set of 300 molecules. In the second part of the article, to test the hypothesis that predictions would improve with more advanced algorithms and higher volumes of training data, we compare these original predictions with those made after the deadline using deep learning models trained on larger solubility data sets consisting of 2999 and 5697 molecules. The results show that there are several algorithms that are able to obtain near state-of-the-art performance on the solubility challenge data sets, with the best model, a graph convolutional neural network, resulting in an RMSE of 0.86 log units. Critical analysis of the models reveals systematic differences between the performance of models using certain feature sets and training data sets. The results suggest that careful selection of high quality training data from relevant regions of chemical space is critical for prediction accuracy but that other methodological issues remain problematic for machine learning solubility models, such as the difficulty in modeling complex chemical spaces from sparse training data sets.


Subjects
Deep Learning, Solubility, Computer Neural Networks, Machine Learning, Algorithms
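
The RMSE figure quoted in the abstract above is the root mean squared difference between predicted and experimental log solubilities. A minimal sketch of that metric follows; the function name and the numbers are hypothetical and chosen only for illustration, not taken from the challenge data sets.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical log S values (log10 of molar solubility), illustration only.
predicted_log_s = [-3.0, -4.5, -2.0, -5.0]
observed_log_s = [-2.5, -4.0, -3.0, -5.0]
print(rmse(predicted_log_s, observed_log_s))
```

Because the error is computed on a log10 scale, an RMSE of about 1 log unit corresponds to a typical error of roughly a factor of ten in solubility.
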
3.
J Chem Phys ; 159(2)2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37435943

ABSTRACT

The ability to predict transport properties of fluids, such as the self-diffusion coefficient and viscosity, has been an ongoing effort in the field of molecular modeling. While there are theoretical approaches to predict the transport properties of simple systems, they are typically applied in the dilute gas regime and are not directly applicable to more complex systems. Other attempts to predict transport properties are performed by fitting available experimental or molecular simulation data to empirical or semi-empirical correlations. Recently, there have been attempts to improve the accuracy of these fittings through the use of Machine-Learning (ML) methods. In this work, the application of ML algorithms to represent the transport properties of systems comprising spherical particles interacting via the Mie potential is investigated. To this end, the self-diffusion coefficient and shear viscosity of 54 potentials are obtained at different regions of the fluid-phase diagram. This data set is used together with three ML algorithms, namely, k-Nearest Neighbors (KNN), Artificial Neural Network (ANN), and Symbolic Regression (SR), to find correlations between the parameters of each potential and the transport properties at different densities and temperatures. It is shown that ANN and KNN perform to a similar extent, followed by SR, which exhibits larger deviations. Finally, the application of the three ML models to predict the self-diffusion coefficient of small molecular systems, such as krypton, methane, and carbon dioxide, is demonstrated using molecular parameters derived from the so-called SAFT-VR Mie equation of state [T. Lafitte et al. J. Chem. Phys. 139, 154504 (2013)] and available experimental vapor-liquid coexistence data.
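
For readers unfamiliar with the Mie potential mentioned in the abstract above, the sketch below evaluates its standard textbook form, U(r) = C ε [(σ/r)^λr − (σ/r)^λa], where the prefactor C sets the well depth to ε. The function and parameter names are assumptions made for this illustration and are not taken from the paper.

```python
def mie_potential(r, epsilon, sigma, lambda_r, lambda_a):
    """Mie pair potential energy at separation r (same length unit as sigma)."""
    if lambda_r <= lambda_a:
        raise ValueError("repulsive exponent must exceed attractive exponent")
    # Prefactor that makes the depth of the potential well equal to epsilon.
    c = (lambda_r / (lambda_r - lambda_a)) * (lambda_r / lambda_a) ** (
        lambda_a / (lambda_r - lambda_a)
    )
    x = sigma / r
    return c * epsilon * (x ** lambda_r - x ** lambda_a)


# With exponents (12, 6) the Mie form reduces to the Lennard-Jones potential,
# whose minimum lies at r = 2**(1/6) * sigma with depth -epsilon.
r_min = 2 ** (1 / 6)
print(mie_potential(r_min, epsilon=1.0, sigma=1.0, lambda_r=12, lambda_a=6))
```

Each distinct pair of exponents (λr, λa) defines a different potential of this family.
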

4.
J Chem Inf Model ; 59(10): 4278-4288, 2019 10 28.
Article in English | MEDLINE | ID: mdl-31549507

ABSTRACT

We present a machine learning approach to automated force field development in dissipative particle dynamics (DPD). The approach employs Bayesian optimization to parametrize a DPD force field against experimentally determined partition coefficients. The optimization process covers a discrete space of over 40 000 000 points, where each point represents the set of potentials that jointly forms a force field. We find that Bayesian optimization is capable of reaching a force field of comparable performance to the current state-of-the-art within 40 iterations. The best iteration during the optimization achieves an R² of 0.78 and an RMSE of 0.63 log units on the training set of data; these metrics are maintained when a validation set is included, giving an R² of 0.8 and an RMSE of 0.65 log units. This work hence provides a proof-of-concept, expounding the utility of coupling automated and efficient global optimization with a top-down, data-driven approach to force field parametrization. Compared to commonly employed alternative methods, Bayesian optimization offers global parameter searching and a low time to solution.


Subjects
Machine Learning, Molecular Dynamics Simulation, Algorithms, Bayes Theorem, Chemical Engineering/methods, Thermodynamics
5.
Chemphyschem ; 18(23): 3360-3368, 2017 Dec 06.
Article in English | MEDLINE | ID: mdl-29094804

ABSTRACT

The electronic effects that govern the cohesion of water clusters are complex, demanding the inclusion of N-body, Coulomb, exchange and correlation effects. Here we present a much-needed quantitative study of the effect of correlation (and hence dispersion) energy on the stabilization of water clusters. For this purpose we used a topological energy partitioning method called Interacting Quantum Atoms (IQA) to partition water clusters into topological atoms, based on an MP2/6-31G(d,p) wave function, and modified versions of GAUSSIAN09 and the Quantum Chemical Topology (QCT) program MORFI. Most of the cohesion in the water clusters provided by electron correlation comes from intramolecular energy stabilization. Hydrogen bond-related interactions tend to largely cancel each other. Electron correlation energies are transferable in almost all instances within 1 kcal mol⁻¹. This observed transferability is very important to the further development of the QCT force field FFLUX, especially to the future modelling of liquid water.

6.
J Chem Inf Model ; 56(11): 2162-2179, 2016 11 28.
Article in English | MEDLINE | ID: mdl-27749062

ABSTRACT

We compare a range of computational methods for the prediction of sublimation thermodynamics (enthalpy, entropy, and free energy of sublimation). These include a model from theoretical chemistry that utilizes crystal lattice energy minimization (with the DMACRYS program) and quantitative structure property relationship (QSPR) models generated by both machine learning (random forest and support vector machines) and regression (partial least squares) methods. Using these methods we investigate the predictability of the enthalpy, entropy and free energy of sublimation, with consideration of whether such a method may be able to improve solubility prediction schemes. Previous work has suggested that the major source of error in solubility prediction schemes involving a thermodynamic cycle via the solid state is in the modeling of the free energy change away from the solid state. Yet contrary to this conclusion other work has found that the inclusion of terms such as the enthalpy of sublimation in QSPR methods does not improve the predictions of solubility. We suggest the use of theoretical chemistry terms, detailed explicitly in the Methods section, as descriptors for the prediction of the enthalpy and free energy of sublimation. A data set of 158 molecules with experimental sublimation thermodynamics values and some CSD refcodes has been collected from the literature and is provided with their original source references.


Subjects
Informatics/methods, Organic Compounds/chemistry, Phase Transition, Entropy, Molecular Models, Molecular Conformation, Quantitative Structure-Activity Relationship
7.
J Chem Inf Model ; 54(3): 844-56, 2014 Mar 24.
Article in English | MEDLINE | ID: mdl-24564264

ABSTRACT

We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure-property relationship (QSPR) models. We also develop machine learning models where the theoretical energies and cheminformatics descriptors are used as combined input. These models are used to predict solvation free energy. While direct theoretical calculation does not give accurate results in this approach, machine learning is able to give predictions with a root mean squared error (RMSE) of ~1.1 log S units in a 10-fold cross-validation for our Drug-Like-Solubility-100 (DLS-100) dataset of 100 druglike molecules. We find that a model built using energy terms from our theoretical methodology as descriptors is marginally less predictive than one built on Chemistry Development Kit (CDK) descriptors. Combining both sets of descriptors allows a further but very modest improvement in the predictions. However, in some cases, this is a statistically significant enhancement. These results suggest that there is little complementarity between the chemical information provided by these two sets of descriptors, despite their different sources and methods of calculation. Our machine learning models are also able to predict the well-known Solubility Challenge dataset with an RMSE value of 0.9-1.0 log S units.


Subjects
Chemical Models, Pharmaceutical Preparations/chemistry, Artificial Intelligence, Crystallization, Molecular Models, Solubility, Thermodynamics, Water/chemistry
8.
J Phys Chem B ; 123(7): 1696-1707, 2019 02 21.
Article in English | MEDLINE | ID: mdl-30657322

ABSTRACT

We wished to compile a data set of results from the experimental literature to support the development and validation of accurate computational models (force fields) for an important class of micelle-forming nonionic surfactant compounds, the poly(ethylene oxide) alkyl ethers, usually denoted CnEm. However, careful examination of the experimental literature exposed a striking degree of variation in values reported for critical micelle concentrations (cmc) and mean aggregation numbers (Nagg). This variation was so large that it masked important trends known to exist within this family of molecules, thereby rendering most of the literature data of limited utility for force field development. In this work, we describe some reasons for the wide variability in the experimental literature, and we present a set of cmc and aggregation number data for 12 CnEm compounds that we feel is appropriate to use for the construction and validation of computational models. The cmc values we selected are from the existing experimental literature and represent a carefully chosen and consistent subset that conveys important trends seen by many of the experimental studies. However, for a corresponding and consistent set of weight-averaged aggregation numbers, we needed to perform new dynamic light scattering (DLS) experiments. The results of these experiments were carefully analyzed to obtain not just mean aggregation numbers but also the underlying micelle size distribution functions. Several trends observed in the cmc and Nagg observables are highlighted and serve as challenges for developers of force field and simulation methodology. The analysis of the DLS experiments accounts for the fact that a broad distribution of micelle sizes exists for many of these compounds and that one must be careful to use the appropriate weighted averages (e.g., mass-weighted vs number-weighted averages) in comparing results from different types of experiments and in comparing results from experiments with those from simulations.
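
The distinction between number-weighted and mass-weighted averages drawn in the abstract above can be illustrated with a small sketch. It assumes that a micelle's mass is proportional to its aggregation number N, so the number average is ΣN·P(N)/ΣP(N) and the mass (weight) average is ΣN²·P(N)/ΣN·P(N). The function name and the size distribution are hypothetical, not data from the paper.

```python
def mean_aggregation_numbers(distribution):
    """Return (number-weighted, mass-weighted) mean aggregation numbers.

    `distribution` maps an aggregation number N to the relative number of
    micelles of that size; any overall normalisation is accepted.
    """
    m0 = sum(distribution.values())
    m1 = sum(n * p for n, p in distribution.items())
    m2 = sum(n * n * p for n, p in distribution.items())
    if m0 <= 0 or m1 <= 0:
        raise ValueError("distribution must contain positive weights")
    # Assumption: micelle mass is proportional to N, so mass weights are N * P(N).
    return m1 / m0, m2 / m1


# Hypothetical broad micelle size distribution, illustration only.
sizes = {50: 1.0, 100: 2.0, 200: 1.0}
number_avg, mass_avg = mean_aggregation_numbers(sizes)
print(number_avg, mass_avg)
```

For any distribution with more than one micelle size the mass-weighted mean exceeds the number-weighted mean, which is one way values reported by different experimental techniques can disagree for the same sample.
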

9.
J Chem Theory Comput ; 14(1): 216-224, 2018 Jan 09.
Article in English | MEDLINE | ID: mdl-29211469

ABSTRACT

We present an innovative method for predicting the dynamic electron correlation energy of an atom or a bond in a molecule utilizing topological atoms. Our approach uses the machine learning method Kriging (Gaussian Process Regression with a non-zero mean function) to predict these dynamic electron correlation energy contributions. The true energy values are calculated by partitioning the MP2 two-particle density-matrix via the Interacting Quantum Atoms (IQA) procedure. To our knowledge, this is the first time such energies have been predicted by a machine learning technique. We present here three important proof-of-concept cases: the water monomer, the water dimer, and the van der Waals complex H2···He. These cases represent the final step toward the design of a full IQA potential for molecular simulation. This final piece will enable us to consider situations in which dispersion is the dominant intermolecular interaction. The results from these examples suggest a new method by which dispersion potentials for molecular simulation can be generated.

10.
J Phys Chem Lett ; 8(9): 1937-1942, 2017 May 04.
Article in English | MEDLINE | ID: mdl-28402120

ABSTRACT

The Interacting Quantum Atoms (IQA) method is used to analyze the correlated part of the Møller-Plesset (MP) perturbation theory two-particle density matrix. Such an analysis determines the effects of electron correlation within atoms and between atoms, which covers both bonds and nonbonded through-space atom-atom interactions within a molecule or molecular complex. Electron correlation lowers the energy of the atoms at either end of a bond, but for the bond itself, it can be stabilizing or destabilizing. Bonds are described in a two-dimensional world of exchange and charge transfer, where covalency is not the opposite of ionicity.

11.
Springerplus ; 5: 259, 2016.
Article in English | MEDLINE | ID: mdl-27006868

ABSTRACT

BACKGROUND: The increasing use of computers in science allows for the scientific analyses of large datasets at an increasing pace. We provided examples and interactive demonstrations at Dundee Science Centre as part of the 2015 Women in Science festival, to present aspects of computational science to the general public. We used low-cost Raspberry Pi computers to provide hands on experience in computer programming and demonstrated the application of computers to biology. Computer games were used as a means to introduce computers to younger visitors. The success of the event was evaluated by voluntary feedback forms completed by visitors, in conjunction with our own self-evaluation. This work builds on the original work of the 4273π bioinformatics education program of Barker et al. (2013, BMC Bioinform. 14:243). 4273π provides open source education materials in bioinformatics. This work looks at the potential to adapt similar materials for public engagement events. RESULTS: It appears, at least in our small sample of visitors (n = 13), that basic computational science can be conveyed to people of all ages by means of interactive demonstrations. Children as young as five were able to successfully edit simple computer programs with supervision. This was, in many cases, their first experience of computer programming. The feedback is predominantly positive, showing strong support for improving computational science education, but also included suggestions for improvement. CONCLUSIONS: Our conclusions are necessarily preliminary. However, feedback forms suggest methods were generally well received among the participants; "Easy to follow. Clear explanation" and "Very easy. Demonstrators were very informative." Our event, held at a local Science Centre in Dundee, demonstrates that computer games and programming activities suitable for young children can be performed alongside a more specialised and applied introduction to computational science for older visitors.

12.
Curr Top Med Chem ; 12(17): 1911-23, 2012.
Article in English | MEDLINE | ID: mdl-23116471

ABSTRACT

Over the last 50 years, sequencing, structural biology and bioinformatics have completely revolutionised biomolecular science, with millions of sequences and tens of thousands of three dimensional structures becoming available. The bioinformatics of enzymes is well served by, mostly free, online databases. BRENDA describes the chemistry, substrate specificity, kinetics, preparation and biological sources of enzymes, while KEGG is valuable for understanding enzymes and metabolic pathways. EzCatDB, SFLD and MACiE are key repositories for data on the chemical mechanisms by which enzymes operate. At the current rate of genome sequencing and manual annotation, human curation will never finish the functional annotation of the ever-expanding list of known enzymes. Hence there is an increasing need for automated annotation, though it is not yet widespread for enzyme data. In contrast, functional ontologies such as the Gene Ontology already profit from automation. Despite our growing understanding of enzyme structure and dynamics, we are only beginning to be able to design novel enzymes. One can now begin to trace the functional evolution of enzymes using phylogenetics. The ability of enzymes to perform secondary functions, albeit relatively inefficiently, gives clues as to how enzyme function evolves. Substrate promiscuity in enzymes is one example of imperfect specificity in protein-ligand interactions. Similarly, most drugs bind to more than one protein target. This may sometimes result in helpful polypharmacology as a drug modulates plural targets, but also often leads to adverse side-effects. Many chemoinformatics approaches can be used to model the interactions between druglike molecules and proteins in silico. We can even use quantum chemical techniques like DFT and QM/MM to compute the structural and energetic course of enzyme catalysed chemical reaction mechanisms, including a full description of bond making and breaking.


Subjects
Computational Biology, Enzymes/chemistry, Protein Databases, Enzymes/genetics, Enzymes/metabolism, Humans, Quantum Theory
13.
J Chem Theory Comput ; 8(9): 3322-37, 2012 Sep 11.
Article in English | MEDLINE | ID: mdl-26605739

ABSTRACT

We demonstrate that the intrinsic aqueous solubility of crystalline druglike molecules can be estimated with reasonable accuracy from sublimation free energies calculated using crystal lattice simulations and hydration free energies calculated using the 3D Reference Interaction Site Model (3D-RISM) of the Integral Equation Theory of Molecular Liquids (IET). The solubilities of 25 crystalline druglike molecules taken from different chemical classes are predicted by the model with a correlation coefficient of R = 0.85 and a root mean square error (RMSE) equal to 1.45 log10S units, which is significantly more accurate than results obtained using implicit continuum solvent models. The method is not directly parametrized against experimental solubility data, and it offers a full computational characterization of the thermodynamics of transfer of the drug molecule from crystal phase to gas phase to dilute aqueous solution.
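
The thermodynamic cycle described in the abstract above (crystal to gas to dilute aqueous solution) can be sketched numerically. The abstract does not state the standard-state convention used, so this example assumes that both free energies refer to a 1 mol/L standard state, in which case ΔG_sol = ΔG_sub + ΔG_hyd = −RT ln S with S in mol/L. The function name and the input energies are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def log10_solubility(dg_sub, dg_hyd, temperature=298.15):
    """Estimate log10 S (S in mol/L) from free energies in kcal/mol.

    Assumption: dg_sub (crystal -> gas) and dg_hyd (gas -> water) both use
    a 1 mol/L standard state, so their sum equals -RT ln S.
    """
    dg_sol = dg_sub + dg_hyd
    return -dg_sol / (R_KCAL * temperature * math.log(10))


# Hypothetical free energies in kcal/mol, illustration only.
print(log10_solubility(dg_sub=12.0, dg_hyd=-8.0))
```

At 298.15 K, RT ln 10 is about 1.36 kcal/mol, so an error of that size in the summed free energy shifts the predicted solubility by one log unit.
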
