Results 1 - 20 of 24
1.
J Chem Phys ; 160(5), 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38341696

ABSTRACT

We study alchemical atomic energy partitioning as a method to estimate atomization energies from atomic contributions, which are defined in physically rigorous and general ways through the use of the uniform electron gas as a joint reference. We analyze quantitatively the relation between atomic energies and their local environment using a dataset of 1325 organic molecules. The atomic energies are transferable across various molecules, enabling the prediction of atomization energies with a mean absolute error of 23 kcal/mol, comparable to simple statistical estimates but potentially more robust given their grounding in the physics-based decomposition scheme. A comparative analysis with other decomposition methods highlights its sensitivity to electrostatic variations, underlining its potential as a representation of the environment as well as in studying processes like diffusion in solids characterized by significant electrostatic shifts.

2.
J Chem Theory Comput ; 19(23): 8861-8870, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-38009856

ABSTRACT

Optimizing a target function over the space of organic molecules is an important problem appearing in many fields of applied science but also a very difficult one due to the vast number of possible molecular systems. We propose an evolutionary Monte Carlo algorithm for solving such problems which is capable of straightforwardly tuning both exploration and exploitation characteristics of an optimization procedure while retaining favorable properties of genetic algorithms. The method, dubbed MOSAiCS (Metropolis Optimization by Sampling Adaptively in Chemical Space), is tested on problems related to optimizing components of battery electrolytes, namely, minimizing solvation energy in water or maximizing dipole moment while enforcing a lower bound on the HOMO-LUMO gap; optimization was carried out over sets of molecular graphs inspired by QM9 and Electrolyte Genome Project (EGP) data sets. MOSAiCS reliably generated molecular candidates with good target quantity values, which were in most cases better than the ones found in QM9 or EGP. While the optimization results presented in this work sometimes required up to 10⁶ QM calculations and were thus feasible only thanks to computationally efficient ab initio approximations of properties of interest, we discuss possible strategies for accelerating MOSAiCS using machine learning approaches.
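
As a rough illustration of the Metropolis acceptance idea named in the abstract above, the following sketch runs a generic Metropolis search over a toy integer space. It is not the MOSAiCS algorithm: the function names, the toy cost, and the proposal move are all assumptions made for this example, and the inverse temperature `beta` merely stands in for the exploration/exploitation knob the abstract mentions.

```python
import math
import random


def metropolis_minimize(cost, propose, start, n_steps, beta, seed=0):
    """Minimize `cost` with Metropolis acceptance at inverse temperature `beta`."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    for _ in range(n_steps):
        candidate = propose(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with
        # probability exp(-beta * delta).
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


# Hypothetical stand-in for a molecular search space: the integers,
# with a single minimum of the cost at x = 7.
def toy_cost(x):
    return (x - 7) ** 2


def toy_propose(x, rng):
    # Move one step left or right, analogous to a small graph mutation.
    return x + rng.choice((-1, 1))


best, best_cost = metropolis_minimize(
    toy_cost, toy_propose, start=0, n_steps=2000, beta=1.0
)
```

A smaller `beta` accepts uphill moves more often (more exploration), while a larger `beta` approaches a greedy search (more exploitation).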

3.
Science ; 381(6654): 170-175, 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37440654

ABSTRACT

Density functional theory (DFT) plays a pivotal role in chemical and materials science because of its relatively high predictive power, applicability, versatility, and computational efficiency. We review recent progress in machine learning (ML) model developments, which have relied heavily on DFT for synthetic data generation and for the design of model architectures. The general relevance of these developments is placed in a broader context for chemical and materials sciences. DFT-based ML models have reached high efficiency, accuracy, scalability, and transferability and pave the way to the routine use of successful experimental planning software within self-driving laboratories.

4.
J Am Chem Soc ; 145(10): 5899-5908, 2023 Mar 15.
Article in English | MEDLINE | ID: mdl-36862462

ABSTRACT

We present an intuitive and general analytical approximation estimating the energy of covalent single and double bonds between participating atoms in terms of their respective nuclear charges with just three parameters, $E_{AB} \approx a - b Z_A Z_B + c\,(Z_A^{7/3} + Z_B^{7/3})$. The functional form of our expression models an alchemical atomic energy decomposition between participating atoms A and B. After calibration, reasonably accurate bond dissociation energy estimates are obtained for hydrogen-saturated diatomics composed of p-block elements coming from the same row 2 ≤ n ≤ 4 in the periodic table. Corresponding changes in bond dissociation energies due to substitution of atom B by C can be obtained via simple formulas. While being of different functional form and origin, our model is as simple and accurate as Pauling's well-known electronegativity model. Analysis indicates that the model's response in covalent bonding to variation in nuclear charge is near-linear, which is consistent with Hammett's equation.
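
The three-parameter expression in the abstract above can be evaluated directly. The sketch below is a minimal illustration, assuming placeholder values for a, b, and c (they are not the calibrated parameters of the paper). The substitution shift is simply the algebraic difference E_AC - E_AB implied by the stated form, which may differ from the formulas the authors actually report.

```python
def bond_energy(z_a, z_b, a, b, c):
    """E_AB ≈ a - b*Z_A*Z_B + c*(Z_A**(7/3) + Z_B**(7/3))."""
    return a - b * z_a * z_b + c * (z_a ** (7 / 3) + z_b ** (7 / 3))


def substitution_shift(z_a, z_b, z_c, b, c):
    """E_AC - E_AB for replacing atom B by atom C; the constant `a` cancels."""
    return -b * z_a * (z_c - z_b) + c * (z_c ** (7 / 3) - z_b ** (7 / 3))


# Placeholder parameters chosen only to exercise the formula (not fitted values).
A_PARAM, B_PARAM, C_PARAM = 1.0, 0.5, 0.1

e_cn = bond_energy(6, 7, A_PARAM, B_PARAM, C_PARAM)  # carbon-nitrogen
e_co = bond_energy(6, 8, A_PARAM, B_PARAM, C_PARAM)  # carbon-oxygen
shift = substitution_shift(6, 7, 8, B_PARAM, C_PARAM)  # N replaced by O
```

The expression is symmetric in A and B, and the shift depends on A only through the bilinear term, so it is linear in Z_A.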

5.
J Chem Phys ; 157(22): 221102, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36546806

ABSTRACT

We use energies and forces predicted within response operator based quantum machine learning (OQML) to perform geometry optimization and transition state search calculations with legacy optimizers but without the need for subsequent re-optimization with quantum chemistry methods. For randomly sampled initial coordinates of small organic query molecules, we report systematic improvement of equilibrium and transition state geometry output as training set sizes increase. Out-of-sample SN2 reactant complexes and transition state geometries have been predicted using the LBFGS and the QST2 algorithms with a root-mean-square deviation (RMSD) of 0.16 and 0.4 Å after training on up to 200 reactant complex relaxations and transition state search trajectories from the QMrxn20 dataset, respectively. For geometry optimizations, we have also considered relaxation paths up to 5,595 constitutional isomers with sum formula C7H10O2 from the QM9-database. Using the resulting OQML models with an LBFGS optimizer reproduces the minimum geometry with an RMSD of 0.14 Å, only using ∼6000 training points obtained from normal mode sampling along the optimization paths of the training compounds without the need for active learning. For converged equilibrium and transition state geometries, subsequent vibrational normal mode frequency analysis indicates deviation from MP2 reference results by on average 14 and 26 cm⁻¹, respectively. While the numerical cost for OQML predictions is negligible in comparison to density functional theory or MP2, the number of steps until convergence is typically larger in either case. The success rate for reaching convergence, however, improves systematically with training set size, underscoring OQML's potential for universal applicability.
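
For readers unfamiliar with the RMSD metric quoted above, here is a minimal sketch of how a root-mean-square deviation between two geometries can be computed. It assumes both geometries list the same atoms in the same order and are already superimposed; whether and how the authors align structures before computing RMSD is not stated in the abstract, and the two-atom coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) points, same units."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty geometries with the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom geometries in Å (illustrative numbers only).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
deviation = rmsd(reference, predicted)
```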


Subjects
Algorithms, Machine Learning, Isomerism
6.
J Chem Phys ; 157(16): 164109, 2022 Oct 28.
Article in English | MEDLINE | ID: mdl-36319406

ABSTRACT

We show that the energy of a perturbed system can be fully recovered from the unperturbed system's electron density. We derive an alchemical integral transform by parametrizing space in terms of transmutations, the chain rule, and integration by parts. Within the radius of convergence, the zeroth order yields the energy expansion at all orders, restricting the textbook statement by Wigner that the p-th order wave function derivative is necessary to describe the (2p + 1)-th energy derivative. Without the need for derivatives of the electron density, this allows us to cover entire chemical neighborhoods from just one quantum calculation instead of single systems one by one. Numerical evidence presented indicates that predictive accuracy is achieved in the range of mHa for the harmonic oscillator or the Morse potential and in the range of machine accuracy for hydrogen-like atoms. Considering isoelectronic nuclear charge variations by one proton in all multi-electron atoms from He to Ne, alchemical integral transform based estimates of the relative energy deviate by only a few mHa from corresponding Hartree-Fock reference numbers.

7.
J Chem Phys ; 157(2): 024303, 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35840379

ABSTRACT

Equilibrium structures determine material properties and biochemical functions. We here propose to machine learn phase space averages, conventionally obtained by ab initio or force-field-based molecular dynamics (MD) or Monte Carlo (MC) simulations. In analogy to ab initio MD, our ab initio machine learning (AIML) model does not require bond topologies and, therefore, enables a general machine learning pathway to obtain ensemble properties throughout the chemical compound space. We demonstrate AIML for predicting Boltzmann averaged structures after training on hundreds of MD trajectories. The AIML output is subsequently used to train machine learning models of free energies of solvation using experimental data and to reach competitive prediction errors (mean absolute error ∼ 0.8 kcal/mol) for out-of-sample molecules, within milliseconds. As such, AIML effectively bypasses the need for MD or MC-based phase space sampling, enabling exploration campaigns of Boltzmann averages throughout the chemical compound space at a much accelerated pace. We contextualize our findings by comparison to state-of-the-art methods resulting in a Pareto plot for the free energy of solvation predictions in terms of accuracy and time.


Subjects
Machine Learning, Molecular Dynamics Simulation, Monte Carlo Method
8.
J Chem Phys ; 156(20): 204111, 2022 May 28.
Article in English | MEDLINE | ID: mdl-35649833

ABSTRACT

Bonding energies play an essential role in describing the relative stability of molecules in chemical space. Therefore, methods employed to search chemical space need to capture the bonding behavior for a wide range of molecules, including radicals. In this work, we investigate the ability of quantum alchemy to capture the bonding behavior of hypothetical chemical compounds, specifically diatomic molecules involving hydrogen with various electronic structures. We evaluate equilibrium bond lengths, ionization energies, and electron affinities of these fundamental systems. We compare and contrast how well manual quantum alchemy calculations, i.e., quantum mechanics calculations in which the nuclear charge is altered, and quantum alchemy approximations using a Taylor series expansion can predict these molecular properties. Our results suggest that while manual quantum alchemy calculations outperform Taylor series approximations, truncations of Taylor series approximations after the second order provide the most accurate Taylor series predictions. Furthermore, these results suggest that trends in quantum alchemy predictions are generally dependent on the predicted property (i.e., equilibrium bond length, ionization energy, or electron affinity). Taken together, this work provides insight into how quantum alchemy predictions using a Taylor series expansion may be applied to future studies of non-singlet systems as well as the challenges that remain open for predicting the bonding behavior of such systems.

9.
J Chem Phys ; 156(6): 064106, 2022 Feb 14.
Article in English | MEDLINE | ID: mdl-35168341

ABSTRACT

Due to the sheer size of chemical and materials space, high-throughput computational screening thereof will require the development of new computational methods that are accurate, efficient, and transferable. These methods need to be applicable to electron configurations beyond ground states. To this end, we have systematically studied the applicability of quantum alchemy predictions using a Taylor series expansion on quantum mechanics (QM) calculations for single atoms with different electronic structures arising from different net charges and electron spin multiplicities. We first compare QM method accuracy to experimental quantities, including first and second ionization energies, electron affinities, and spin multiplet energy gaps, for a baseline understanding of QM reference data. Next, we investigate the intrinsic accuracy of "manual" quantum alchemy. This method uses QM calculations involving nuclear charge perturbations of one atom's basis set to model another. We then discuss the reliability of quantum alchemy based on Taylor series approximations at different orders of truncation. Overall, we find that the errors from finite basis set treatments in quantum alchemy are significantly reduced when thermodynamic cycles are employed, which highlights a route to improve quantum alchemy in explorations of chemical space. This work establishes important technical aspects that impact the accuracy of quantum alchemy predictions using a Taylor series and provides a foundation for further quantum alchemy studies.

10.
J Chem Phys ; 155(22): 224103, 2021 Dec 14.
Article in English | MEDLINE | ID: mdl-34911321

ABSTRACT

Doping compounds can be considered a perturbation to the nuclear charges in a molecular Hamiltonian. Expansions of this perturbation in a Taylor series, i.e., quantum alchemy, have been used in the literature to assess millions of derivative compounds at once rather than enumerating them in costly quantum chemistry calculations. So far, it was unclear whether this series even converges for small molecules, whether it can be used for geometry relaxation, and how strong this perturbation may be to still obtain convergent numbers. This work provides numerical evidence that this expansion converges and recovers the self-consistent energy of Hartree-Fock calculations. The convergence radius of this expansion is quantified for dimer examples and systematically evaluated for different basis sets, allowing for estimates of the chemical space that can be covered by perturbing one reference calculation alone. Besides electronic energy, convergence is shown for density matrix elements, molecular orbital energies, and density profiles, even for large changes in electronic structure, e.g., transforming He3 into H6. Subsequently, mixed alchemical and spatial derivatives are used to relax H2 from the electronic structure of He alone, highlighting a path to spatially relaxed quantum alchemy. Finally, the underlying code that allows for arbitrarily accurate evaluation of restricted Hartree-Fock energies and arbitrary order derivatives is made available to support future method development.
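
To make the idea of a Taylor expansion in nuclear charge concrete, the following toy sketch uses the one-electron (hydrogen-like) atom, whose ground-state energy is E(Z) = -Z²/2 in Hartree. This is an illustrative assumption chosen because the derivatives are known analytically; it is not the Hartree-Fock code the abstract refers to, and the function names are invented for this example.

```python
def hydrogen_like_energy(z):
    """Ground-state energy of a one-electron atom with nuclear charge z, in Hartree."""
    return -0.5 * z * z


def taylor_prediction(z_ref, z_target, order):
    """Truncated Taylor expansion of E(Z) around z_ref, evaluated at z_target."""
    dz = z_target - z_ref
    # Analytic derivatives of E(Z) = -Z^2/2: E' = -Z, E'' = -1, higher ones vanish.
    derivatives = [hydrogen_like_energy(z_ref), -z_ref, -1.0]
    factorials = [1.0, 1.0, 2.0]
    terms = [d * dz ** k / f for k, (d, f) in enumerate(zip(derivatives, factorials))]
    # Orders above 2 add nothing because all higher derivatives are zero.
    return sum(terms[: order + 1])


# Predict He+ (Z = 2) from a hydrogen reference (Z = 1) at orders 0, 1, 2.
predictions = [taylor_prediction(1.0, 2.0, order) for order in range(3)]
# predictions == [-0.5, -1.5, -2.0]; the exact He+ energy is -2.0 Hartree.
```

In this special case the series terminates at second order and reproduces the target exactly; for many-electron systems, convergence has to be checked numerically, which is the question the abstract addresses.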

11.
J Chem Phys ; 155(6): 064105, 2021 Aug 14.
Article in English | MEDLINE | ID: mdl-34391351

ABSTRACT

The interplay of kinetics and thermodynamics governs reactive processes, and their control is key in synthesis efforts. While sophisticated numerical methods for studying equilibrium states have well advanced, quantitative predictions of kinetic behavior remain challenging. We introduce a reactant-to-barrier (R2B) machine learning model that rapidly and accurately infers activation energies and transition state geometries throughout the chemical compound space. R2B exhibits improving accuracy as training set sizes grow and requires as input solely the molecular graph of the reactant and the information of the reaction type. We provide numerical evidence for the applicability of R2B for two competing text-book reactions relevant to organic synthesis, E2 and SN2, trained and tested on chemically diverse quantum data from the literature. After training on 1-1.8k examples, R2B predicts activation energies on average within less than 2.5 kcal/mol with respect to the coupled-cluster singles doubles reference within milliseconds. Principal component analysis of kernel matrices reveals the hierarchy of the multiple scales underpinning reactivity in chemical space: Nucleophiles and leaving groups, substituents, and pairwise substituent combinations correspond to systematic lowering of eigenvalues. Analysis of R2B based predictions of ∼11.5k E2 and SN2 barriers in the gas-phase for previously undocumented reactants indicates that on average, E2 is favored in 75% of all cases and that SN2 becomes likely for chlorine as nucleophile/leaving group and for substituents consisting of hydrogen or electron-withdrawing groups. Experimental reaction design from first principles is enabled due to R2B, which is demonstrated by the construction of decision trees. Numerical R2B based results for interatomic distances and angles of reactant and transition state geometries suggest that Hammond's postulate is applicable to SN2, but not to E2.

12.
Nat Commun ; 12(1): 4468, 2021 Jul 22.
Article in English | MEDLINE | ID: mdl-34294693

ABSTRACT

The computational prediction of atomistic structure is a long-standing problem in physics, chemistry, materials, and biology. Conventionally, force-fields or ab initio methods determine structure through energy minimization, which is either approximate or computationally demanding. This accuracy/cost trade-off prohibits the generation of synthetic big data sets accounting for chemical space with atomistic detail. Exploiting implicit correlations among relaxed structures in training data sets, our machine learning model Graph-To-Structure (G2S) generalizes across compound space in order to infer interatomic distances for out-of-sample compounds, effectively enabling the direct reconstruction of coordinates, and thereby bypassing the conventional energy optimization task. The numerical evidence collected includes 3D coordinate predictions for organic molecules, transition states, and crystalline solids. G2S improves systematically with training set size, reaching mean absolute interatomic distance prediction errors of less than 0.2 Å with fewer than eight thousand training structures, on par with or better than conventional structure generators. Applicability tests of G2S include successful predictions for systems which typically require manual intervention, improved initial guesses for subsequent conventional ab initio based relaxation, and input generation for subsequent use of structure based quantum machine learning models.

13.
Environ Sci Technol ; 55(12): 8447-8457, 2021 Jun 15.
Article in English | MEDLINE | ID: mdl-34080853

ABSTRACT

Brown carbon (BrC) is involved in atmospheric light absorption and climate forcing and can cause adverse health effects. Understanding the formation mechanisms and molecular structure of BrC is of key importance in developing strategies to control its environmental and health impact. Structure determination of BrC is challenging, due to the lack of experiments providing molecular fingerprints and the sheer number of molecular candidates with identical mass. Suggestions based on chemical intuition are prone to errors due to the inherent bias. We present an unbiased algorithm, using graph-based molecule generation and machine learning, which can identify all molecular structures of compounds involved in biomass burning and the composition of BrC. We apply this algorithm to C12H12O7, a light-absorbing "test case" molecule identified in chamber experiments on the aqueous photo-oxidation of syringol, a prevalent marker in wood smoke. Of the 260 million molecular graphs, the algorithm leaves only 36,518 (0.01%) as viable candidates matching the spectrum. Although no unique molecular structure is obtained from only a chemical formula and a UV/vis absorption spectrum, we discuss further reduction strategies and their efficacy. With additional data, the method can potentially more rapidly identify isomers extracted from lab and field aerosol particles without introducing human bias.


Subjects
Carbon, Intuition, Aerosols, Biomass, Humans, Machine Learning
14.
Sci Adv ; 7(21), 2021 May.
Article in English | MEDLINE | ID: mdl-34138735

ABSTRACT

Brute-force compute campaigns relying on demanding ab initio calculations routinely search for previously unknown materials in chemical compound space (CCS), the vast set of all conceivable stable combinations of elements and structural configurations. Here, we demonstrate that four-dimensional chirality arising from antisymmetry of alchemical perturbations dissects CCS and defines approximate ranks, which reduce its formal dimensionality and break down its combinatorial scaling. The resulting "alchemical" enantiomers have the same electronic energy up to the third order, independent of respective covalent bond topology, imposing relevant constraints on chemical bonding. Alchemical chirality deepens our understanding of CCS and enables the establishment of trends without empiricism for any materials with fixed lattices. We demonstrate the efficacy for three cases: (i) new rules for electronic energy contributions to chemical bonding; (ii) analysis of the electron density of BN-doped benzene; and (iii) ranking over 2000 and 4 million BN-doped naphthalene and picene derivatives, respectively.

15.
J Chem Phys ; 153(14): 144118, 2020 Oct 14.
Article in English | MEDLINE | ID: mdl-33086815

ABSTRACT

Alchemical perturbation density functional theory has been shown to be an efficient and computationally inexpensive way to explore chemical compound space. We investigate approximations made, in terms of atomic basis sets and the perturbation order, introduce an electron-density based estimate of errors of the alchemical prediction, and propose a correction for effects due to basis set incompleteness. Our numerical analysis of potential energy estimates, and resulting binding curves, is based on coupled-cluster single double (CCSD) reference results and is limited to all neutral diatomics with 14 electrons (AlH⋯NN). The method predicts binding energy, equilibrium distance, and vibrational frequencies of neighboring out-of-sample diatomics with near CCSD quality using perturbations up to the fifth order. We also discuss simultaneous alchemical mutations at multiple sites in benzene.

16.
Phys Chem Chem Phys ; 22(19): 10519-10525, 2020 May 21.
Article in English | MEDLINE | ID: mdl-31960870

ABSTRACT

We assess the applicability of alchemical perturbation density functional theory (APDFT) for quickly and accurately estimating deprotonation energies. We have considered all possible single and double deprotonations in one hundred small organic molecules drawn at random from QM9 [Ramakrishnan et al., JCTC, 2015]. Numerical evidence is presented for 5160 deprotonated species at both HF/def2-TZVP and CCSD/6-31G* levels of theory. We show that the perturbation expansion formalism of APDFT quickly converges to reliable results: using CCSD electron densities and derivatives, regular Hartree-Fock calculations are outperformed at the second or third order for ranking all possible doubly or singly deprotonated molecules, respectively. CCSD single deprotonation energies are reproduced within 1.4 kcal mol⁻¹ on average within third order APDFT. We introduce a hybrid approach where the computational cost of APDFT is reduced even further by mixing first order terms at a higher level of theory (CCSD) with higher order terms at a lower level of theory only (HF). We find that this approach reaches 2 kcal mol⁻¹ accuracy in absolute deprotonation energies compared to CCSD at 2% of the computational cost of third order APDFT.

17.
Chem Sci ; 11(43): 11859-11868, 2020 Oct 02.
Article in English | MEDLINE | ID: mdl-34094415

ABSTRACT

It is intriguing how the Hammett equation enables control of chemical reactivity throughout chemical space by separating the effect of substituents from chemical process variables, such as reaction mechanism, solvent, or temperature. We generalize Hammett's original approach to predict potential energies of activation in non-aromatic molecular scaffolds with multiple substituents. We use global regression to optimize Hammett parameters ρ and σ in two experimental datasets (rate constants for benzylbromides reacting with thiols and ammonium salt decomposition), as well as in a synthetic dataset consisting of computational activation energies of ∼2400 SN2 reactions, with various nucleophiles and leaving groups (-H, -F, -Cl, -Br) and functional groups (-H, -NO2, -CN, -NH3, -CH3). Individual substituents contribute additively to molecular σ with a unique regression term, which quantifies the inductive effect. The position dependence of substituents can be modeled by a distance decaying factor for SN2. Use of the Hammett equation as a baseline model for Δ-machine learning models of the activation energy in chemical space results in substantially improved learning curves reaching low prediction errors for small training sets.
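
The classic Hammett relation, log10(k/k0) = ρσ, combined with the additive treatment of substituents described above, can be sketched in a few lines. The σ values and ρ below are hypothetical placeholders chosen for illustration; they are neither tabulated Hammett constants nor the regressed parameters from the paper, and the paper's generalization to activation energies may take a different form.

```python
def molecular_sigma(substituents, sigma_table):
    """Additive molecular sigma: the sum of per-substituent contributions."""
    return sum(sigma_table[name] for name in substituents)


def log_rate_ratio(rho, substituents, sigma_table):
    """Classic Hammett relation: log10(k / k0) = rho * sigma."""
    return rho * molecular_sigma(substituents, sigma_table)


# Hypothetical substituent constants (placeholders, not fitted or tabulated values).
SIGMA = {"H": 0.0, "NO2": 0.8, "CH3": -0.2}
RHO = 2.0  # hypothetical reaction constant

unsubstituted = log_rate_ratio(RHO, ["H", "H"], SIGMA)
mixed = log_rate_ratio(RHO, ["NO2", "CH3"], SIGMA)
```

Here ρ carries everything specific to the reaction and its conditions, while σ carries everything specific to the substituents, which is the separation the abstract highlights.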

18.
J Phys Chem B ; 123(47): 10073-10082, 2019 Nov 27.
Article in English | MEDLINE | ID: mdl-31647233

ABSTRACT

Based on thermodynamic integration, we introduce atoms in molecules (AIM) using the orbital-free framework of alchemical perturbation density functional theory (APDFT). Within APDFT, atomic energies and electron densities in molecules are arbitrary because any reference system and integration path can be selected as long as it meets the boundary conditions. We choose the uniform electron gas (jellium) as a reference and linearly scale up all nuclear charges, situated at any query molecule's atomic coordinates. Within the approximations made when calculating one-particle electron densities, this universal choice affords unambiguous and exact definitions of energies and electron densities of AIMs. Numerical results are presented for neutral small molecules (CO, N2, BF, CO2), various small molecules with different electronic hybridization states of carbon (CH4, C2H6, C2H4, C2H2, HCN), and all of the possible BN-doped mutants connecting benzene to borazine ($\mathrm{C}_{2n}\mathrm{B}_{3-n}\mathrm{N}_{3-n}\mathrm{H}_{6}$, $0 \le n \le 3$). Our results, as well as comparison to atomic energy estimates resulting from either DFT trained neural network models or atomic basis set overlap within CCSD, suggest that APDFT based AIMs enable meaningful, interesting, and counterintuitive interpretations of chemical bonding and molecular electron densities.

19.
J Phys Chem Lett ; 9(18): 5574-5582, 2018 Sep 20.
Article in English | MEDLINE | ID: mdl-30180586

ABSTRACT

The interface between transition metal oxides (TMO) and liquid water plays a crucial role in environmental chemistry, catalysis, and energy science. Yet, the mechanism and energetics of chemical transformations at solvated TMO surfaces are often unclear, largely because of the difficulty of characterizing the active surface species experimentally. The hematite (α-Fe2O3)-liquid water interface is a case in point. Here we demonstrate that ab initio molecular dynamics is a viable tool for determining the protonation states of complex interfaces. The pKa values of the oxygen-terminated (001) surface group of hematite, ≡OH, and half-layer terminated (012) surface groups, ≡2OH and ≡1OH2, are predicted to be (18.5 ± 0.3), (18.9 ± 0.6), and (10.3 ± 0.5) pKa units, respectively. These are in good agreement with recent bond-valence theory based estimates, and suggest that the deprotonation of these surfaces requires significantly more free energy input than previously thought.

20.
J Chem Theory Comput ; 13(5): 2178-2184, 2017 May 09.
Article in English | MEDLINE | ID: mdl-28350956

ABSTRACT

Density functional theory-based molecular dynamics calculations of condensed phase systems often benefit from the use of hybrid functionals. However, their use is computationally very demanding and severely limits the system size and time scale that can be simulated. Several methods have been introduced to accelerate hybrid functional molecular dynamics including Schwarz screening and the auxiliary density matrix method (ADMM). Here we present a simple screening scheme that can be applied in addition to these methods. It works by examining Hartree-Fock exchange (HFX) integrals and subsequently excluding those that contribute very little to any nuclear force component. The resultant force error is corrected by a history-dependent extrapolation scheme. We find that for systems where the calculation of HFX forces is a major bottleneck, a large fraction of the integrals can be neglected without introducing significant errors in the nuclear forces. For instance, for a 2 × 2 × 2 unit cell of CoO, 92% of the HFX integrals that have passed Schwarz screening within the ADMM approach can be neglected leading to a performance gain of a factor of 3 at a negligible error in nuclear forces (≤5 × 10⁻⁴ H bohr⁻¹). We also show that total energy conservation and solvation structures are not adversely affected by the screening method.
