Results 1 - 20 of 84
1.
Phys Chem Chem Phys ; 26(5): 4306-4319, 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38234256

ABSTRACT

The efficiency of machine learning algorithms for electronically excited states is far behind ground-state applications. One of the underlying problems is the insufficient smoothness of the fitted potential energy surfaces and other properties in the vicinity of state crossings and conical intersections, which is a prerequisite for an efficient regression. Smooth surfaces can be obtained by switching to the diabatic basis. However, diabatization itself is still an outstanding problem. We overcome these limitations by solving both problems at once. We use a machine learning approach combining clustering and regression techniques to correct for the deficiencies of property-based diabatization which, in return, provides us with smooth surfaces that can be easily fitted. Our approach extends the applicability of property-based diabatization to multidimensional systems. We utilize the proposed diabatization scheme to achieve higher prediction accuracy for adiabatic states and we show its performance by reconstructing global potential energy surfaces of excited states of nitrosyl fluoride and formaldehyde. While the proposed methodology is independent of the specific property-based diabatization and regression algorithm, we show its performance for kernel ridge regression and a very simple diabatization based on transition multipoles. Compared to most other algorithms based on machine learning, our approach needs only a small amount of training data.
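
The abstract names kernel ridge regression as the regression step. As a rough illustration only, here is a minimal NumPy sketch of kernel ridge regression with a Gaussian kernel on a smooth one-dimensional toy function. The kernel choice, the width `sigma`, the regularizer `lam`, and the toy data are all assumptions made for this example; they are not taken from the paper, and the diabatization and clustering steps are not shown.

```python
import numpy as np

def gaussian_kernel(xa, xb, sigma):
    """Gaussian kernel matrix between two sets of row vectors."""
    sq_dist = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def krr_fit(x_train, y_train, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)

def krr_predict(x_query, x_train, alpha, sigma):
    """Prediction is a kernel-weighted sum over the training points."""
    return gaussian_kernel(x_query, x_train, sigma) @ alpha

# Toy data (hypothetical): a smooth curve standing in for a smooth surface.
x_train = np.linspace(-2.0, 2.0, 15).reshape(-1, 1)
y_train = np.sin(x_train[:, 0])

alpha = krr_fit(x_train, y_train, sigma=0.5, lam=1e-6)
y_pred = krr_predict(np.array([[0.3]]), x_train, alpha, sigma=0.5)
```

The example works well because the target is smooth; a function with a cusp, as near a conical intersection in the adiabatic picture, would typically need far more training points for the same accuracy, which is the motivation the abstract gives for fitting in a diabatic basis.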

2.
J Chem Phys ; 160(5), 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38341696

ABSTRACT

We study alchemical atomic energy partitioning as a method to estimate atomization energies from atomic contributions, which are defined in physically rigorous and general ways through the use of the uniform electron gas as a joint reference. We analyze quantitatively the relation between atomic energies and their local environment using a dataset of 1325 organic molecules. The atomic energies are transferable across various molecules, enabling the prediction of atomization energies with a mean absolute error of 23 kcal/mol, comparable to simple statistical estimates but potentially more robust given their grounding in the physics-based decomposition scheme. A comparative analysis with other decomposition methods highlights its sensitivity to electrostatic variations, underlining its potential as a representation of the environment as well as in studying processes like diffusion in solids characterized by significant electrostatic shifts.

3.
J Am Chem Soc ; 145(10): 5899-5908, 2023 Mar 15.
Article in English | MEDLINE | ID: mdl-36862462

ABSTRACT

We present an intuitive and general analytical approximation estimating the energy of covalent single and double bonds between participating atoms in terms of their respective nuclear charges with just three parameters, $E_{AB} \approx a - b Z_A Z_B + c\,(Z_A^{7/3} + Z_B^{7/3})$. The functional form of our expression models an alchemical atomic energy decomposition between participating atoms A and B. After calibration, reasonably accurate bond dissociation energy estimates are obtained for hydrogen-saturated diatomics composed of p-block elements coming from the same row 2 ≤ n ≤ 4 in the periodic table. Corresponding changes in bond dissociation energies due to substitution of atom B by C can be obtained via simple formulas. While being of different functional form and origin, our model is as simple and accurate as Pauling's well-known electronegativity model. Analysis indicates that the model's response in covalent bonding to variation in nuclear charge is near-linear, which is consistent with Hammett's equation.
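
The three-parameter expression in this abstract (a constant, a term proportional to the product of the nuclear charges, and a term in the 7/3 powers of the charges) is simple enough to evaluate directly. The sketch below does so and also derives the energy change on replacing atom B by C, which follows from the formula because the constant cancels. The parameter values are placeholders chosen only so that the code runs; they are not the calibrated values from the paper, and the function names are hypothetical.

```python
def bond_energy(z_a, z_b, a, b, c):
    """E_AB ~ a - b*Z_A*Z_B + c*(Z_A**(7/3) + Z_B**(7/3))."""
    return a - b * z_a * z_b + c * (z_a ** (7.0 / 3.0) + z_b ** (7.0 / 3.0))

def substitution_shift(z_a, z_b, z_c, b, c):
    """E_AC - E_AB: the constant a and the Z_A**(7/3) term cancel."""
    return -b * z_a * (z_c - z_b) + c * (z_c ** (7.0 / 3.0) - z_b ** (7.0 / 3.0))

# Placeholder parameters (NOT fitted values; units left unspecified).
A_PARAM, B_PARAM, C_PARAM = 100.0, 1.0, 0.5

e_cn = bond_energy(6, 7, A_PARAM, B_PARAM, C_PARAM)   # Z = 6 with Z = 7
e_co = bond_energy(6, 8, A_PARAM, B_PARAM, C_PARAM)   # Z = 6 with Z = 8
shift = substitution_shift(6, 7, 8, B_PARAM, C_PARAM)  # replace Z = 7 by Z = 8
```

In practice the three parameters would be calibrated separately for each bond type and row, as the abstract indicates; the shift formula then needs only `b` and `c`.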

4.
Chem Rev ; 121(16): 10001-10036, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-34387476

ABSTRACT

Chemical compound space (CCS), the set of all theoretically conceivable combinations of chemical elements and (meta-)stable geometries that make up matter, is colossal. The first-principles based virtual sampling of this space, for example, in search of novel molecules or materials which exhibit desirable properties, is therefore prohibitive for all but the smallest subsets and simplest properties. We review studies aimed at tackling this challenge using modern machine learning techniques based on (i) synthetic data, typically generated using quantum mechanics based methods, and (ii) model architectures inspired by quantum mechanics. Such Quantum mechanics based Machine Learning (QML) approaches combine the numerical efficiency of statistical surrogate models with an ab initio view on matter. They rigorously reflect the underlying physics in order to reach universality and transferability across CCS. While state-of-the-art approximations to quantum problems impose severe computational bottlenecks, recent QML based developments indicate the possibility of substantial acceleration without sacrificing the predictive power of quantum mechanics.


Subjects
Inorganic Chemicals/chemistry, Machine Learning, Organic Chemicals/chemistry, Quantum Theory
5.
Phys Chem Chem Phys ; 25(20): 13933-13945, 2023 May 24.
Article in English | MEDLINE | ID: mdl-37190820

ABSTRACT

Recent advances in experimental methodology have enabled studies of the quantum-state- and conformational dependence of chemical reactions under precisely controlled conditions in the gas phase. Here, we generated samples of selected gauche and s-trans 2,3-dibromobutadiene (DBB) by electrostatic deflection in a molecular beam and studied their reaction with Coulomb crystals of laser-cooled Ca⁺ ions in an ion trap. The rate coefficients for the total reaction were found to strongly depend on both the conformation of DBB and the electronic state of Ca⁺. In the (4p) ²P₁/₂ and (3d) ²D₃/₂ excited states of Ca⁺, the reaction is capture-limited and faster for the gauche conformer due to long-range ion-dipole interactions. In the (4s) ²S₁/₂ ground state of Ca⁺, the reaction rate for s-trans DBB still conforms with the capture limit, while that for gauche DBB is strongly suppressed. The experimental observations were analysed with the help of adiabatic capture theory, ab initio calculations and reactive molecular dynamics simulations on a machine-learned full-dimensional potential energy surface of the system. The theory yields near-quantitative agreement for s-trans DBB, but overestimates the reactivity of the gauche conformer compared to the experiment. The present study points to the important role of molecular geometry even in strongly reactive exothermic systems and illustrates striking differences in the reactivity of individual conformers in gas-phase ion-molecule reactions.

6.
J Chem Phys ; 159(3), 2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37462285

ABSTRACT

The feature vector mapping used to represent chemical systems is a key factor governing the superior data efficiency of kernel based quantum machine learning (QML) models applicable throughout chemical compound space. Unfortunately, the most accurate representations require a high dimensional feature mapping, thereby imposing a considerable computational burden on model training and use. We introduce compact yet accurate, linear scaling QML representations based on atomic Gaussian many-body distribution functionals (MBDF) and their derivatives. Weighted density functions of MBDF values are used as global representations that are constant in size, i.e., invariant with respect to the number of atoms. We report predictive performance and training data efficiency that is competitive with state-of-the-art for two diverse datasets of organic molecules, QM9 and QMugs. Generalization capability has been investigated for atomization energies, highest occupied molecular orbital-lowest unoccupied molecular orbital eigenvalues and gap, internal energies at 0 K, zero point vibrational energies, dipole moment norm, static isotropic polarizability, and heat capacity as encoded in QM9. MBDF based QM9 performance lowers the optimal Pareto front spanned between sampling and training cost to compute node minutes, effectively sampling chemical compound space with chemical accuracy at a sampling rate of ∼48 molecules per core second.

7.
J Chem Phys ; 156(18): 184801, 2022 May 14.
Article in English | MEDLINE | ID: mdl-35568550

ABSTRACT

We propose the relaxation of geometries throughout chemical compound space using alchemical perturbation density functional theory (APDFT). APDFT refers to perturbation theory involving changes in nuclear charges within approximate solutions to Schrödinger's equation. We give an analytical formula to calculate the mixed second order energy derivatives with respect to both nuclear charges and nuclear positions (named "alchemical force") within the restricted Hartree-Fock case. We have implemented and studied the formula for its use in geometry relaxation of various reference and target molecules. We have also analyzed the convergence of the alchemical force perturbation series as well as basis set effects. Interpolating alchemically predicted energies, forces, and Hessian to a Morse potential yields more accurate geometries and equilibrium energies than when performing a standard Newton-Raphson step. Our numerical predictions for small molecules including BF, CO, N2, CH4, NH3, H2O, and HF yield mean absolute errors of equilibrium energies and bond lengths smaller than 10 mHa and 0.01 bohr for fourth order APDFT predictions, respectively. Our alchemical geometry relaxation still preserves the combinatorial efficiency of APDFT: Based on a single coupled perturbed Hartree-Fock derivative for benzene, we provide numerical predictions of equilibrium energies and relaxed structures of all 17 iso-electronic charge-neutral BN-doped mutants with averaged absolute deviations of ∼27 mHa and ∼0.12 bohr, respectively.


Subjects
Physical Phenomena
8.
J Chem Phys ; 156(11): 114101, 2022 Mar 21.
Article in English | MEDLINE | ID: mdl-35317562

ABSTRACT

We introduce an electronic structure based representation for quantum machine learning (QML) of electronic properties throughout chemical compound space. The representation is constructed using computationally inexpensive ab initio calculations and explicitly accounts for changes in the electronic structure. We demonstrate the accuracy and flexibility of resulting QML models when applied to property labels, such as total potential energy, HOMO and LUMO energies, ionization potential, and electron affinity, using as datasets for training and testing entries from the QM7b, QM7b-T, QM9, and LIBE libraries. For the latter, we also demonstrate the ability of this approach to account for molecular species of different charge and spin multiplicity, resulting in QML models that infer total potential energies based on geometry, charge, and spin as input.

9.
J Chem Phys ; 157(2): 024303, 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35840379

ABSTRACT

Equilibrium structures determine material properties and biochemical functions. We here propose to machine learn phase space averages, conventionally obtained by ab initio or force-field-based molecular dynamics (MD) or Monte Carlo (MC) simulations. In analogy to ab initio MD, our ab initio machine learning (AIML) model does not require bond topologies and, therefore, enables a general machine learning pathway to obtain ensemble properties throughout the chemical compound space. We demonstrate AIML for predicting Boltzmann averaged structures after training on hundreds of MD trajectories. The AIML output is subsequently used to train machine learning models of free energies of solvation using experimental data and to reach competitive prediction errors (mean absolute error ∼ 0.8 kcal/mol) for out-of-sample molecules within milliseconds. As such, AIML effectively bypasses the need for MD or MC-based phase space sampling, enabling exploration campaigns of Boltzmann averages throughout the chemical compound space at a much accelerated pace. We contextualize our findings by comparison to state-of-the-art methods resulting in a Pareto plot for the free energy of solvation predictions in terms of accuracy and time.


Subjects
Machine Learning, Molecular Dynamics Simulation, Monte Carlo Method
10.
J Chem Phys ; 157(16): 164109, 2022 Oct 28.
Article in English | MEDLINE | ID: mdl-36319406

ABSTRACT

We show that the energy of a perturbed system can be fully recovered from the unperturbed system's electron density. We derive an alchemical integral transform by parametrizing space in terms of transmutations, the chain rule, and integration by parts. Within the radius of convergence, the zeroth order yields the energy expansion at all orders, restricting the textbook statement by Wigner that the p-th order wave function derivative is necessary to describe the (2p + 1)-th energy derivative. Without the need for derivatives of the electron density, this allows us to cover entire chemical neighborhoods from just one quantum calculation instead of single systems one by one. Numerical evidence presented indicates that predictive accuracy is achieved in the range of mHa for the harmonic oscillator or the Morse potential and in the range of machine accuracy for hydrogen-like atoms. Considering isoelectronic nuclear charge variations by one proton in all multi-electron atoms from He to Ne, alchemical integral transform based estimates of the relative energy deviate by only a few mHa from corresponding Hartree-Fock reference numbers.

11.
J Chem Phys ; 157(22): 221102, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36546806

ABSTRACT

We use energies and forces predicted within response operator based quantum machine learning (OQML) to perform geometry optimization and transition state search calculations with legacy optimizers but without the need for subsequent re-optimization with quantum chemistry methods. For randomly sampled initial coordinates of small organic query molecules, we report systematic improvement of equilibrium and transition state geometry output as training set sizes increase. Out-of-sample SN2 reactant complexes and transition state geometries have been predicted using the LBFGS and the QST2 algorithms with a root-mean-square deviation (RMSD) of 0.16 and 0.4 Å, after training on up to 200 reactant complex relaxations and transition state search trajectories from the QMrxn20 dataset, respectively. For geometry optimizations, we have also considered relaxation paths of up to 5,595 constitutional isomers with sum formula C7H10O2 from the QM9 database. Using the resulting OQML models with an LBFGS optimizer reproduces the minimum geometry with an RMSD of 0.14 Å, only using ∼6000 training points obtained from normal mode sampling along the optimization paths of the training compounds without the need for active learning. For converged equilibrium and transition state geometries, subsequent vibrational normal mode frequency analysis indicates deviation from MP2 reference results by on average 14 and 26 cm⁻¹, respectively. While the numerical cost for OQML predictions is negligible in comparison to density functional theory or MP2, the number of steps until convergence is typically larger in either case. The success rate for reaching convergence, however, improves systematically with training set size, underscoring OQML's potential for universal applicability.


Subjects
Algorithms, Machine Learning, Isomerism
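
The geometry errors in this abstract are reported as root-mean-square deviations (RMSD). The abstract does not state how structures were superimposed before comparison, so the following is only a sketch of one common convention: remove the translation, find the optimal rotation with the Kabsch algorithm, and then compute the RMSD over atoms. It assumes both geometries list the same atoms in the same order; the function name and the toy coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(p, q):
    """RMSD between (N, 3) coordinate arrays after optimal superposition."""
    p_c = p - p.mean(axis=0)                 # remove translation
    q_c = q - q.mean(axis=0)
    h = p_c.T @ q_c                          # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))   # guard against reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = p_c @ rot.T - q_c                 # rotate p onto q
    return float(np.sqrt((diff ** 2).sum() / len(p)))

# Toy check (hypothetical coordinates): a rotated, shifted copy has zero RMSD.
ref = np.array([[0.0, 0.0, 0.0],
                [1.2, 0.0, 0.0],
                [0.0, 1.1, 0.0],
                [0.3, 0.2, 0.9]])
theta = np.deg2rad(40.0)
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
moved = ref @ rz.T + np.array([0.5, -1.0, 2.0])
rmsd_same = kabsch_rmsd(moved, ref)

distorted = ref.copy()
distorted[3, 2] += 0.4                       # displace one atom
rmsd_distorted = kabsch_rmsd(distorted, ref)
```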
12.
Environ Sci Technol ; 55(12): 8447-8457, 2021 Jun 15.
Article in English | MEDLINE | ID: mdl-34080853

ABSTRACT

Brown carbon (BrC) is involved in atmospheric light absorption and climate forcing and can cause adverse health effects. Understanding the formation mechanisms and molecular structure of BrC is of key importance in developing strategies to control its environmental and health impacts. Structure determination of BrC is challenging, due to the lack of experiments providing molecular fingerprints and the sheer number of molecular candidates with identical mass. Suggestions based on chemical intuition are prone to errors due to the inherent bias. We present an unbiased algorithm, using graph-based molecule generation and machine learning, which can identify all molecular structures of compounds involved in biomass burning and the composition of BrC. We apply this algorithm to C12H12O7, a light-absorbing "test case" molecule identified in chamber experiments on the aqueous photo-oxidation of syringol, a prevalent marker in wood smoke. Of the 260 million molecular graphs, the algorithm leaves only 36,518 (0.01%) as viable candidates matching the spectrum. Although no unique molecular structure is obtained from only a chemical formula and a UV/vis absorption spectrum, we discuss further reduction strategies and their efficacy. With additional data, the method can potentially more rapidly identify isomers extracted from lab and field aerosol particles without introducing human bias.


Subjects
Carbon, Intuition, Aerosols, Biomass, Humans, Machine Learning
13.
J Chem Phys ; 155(6): 064105, 2021 Aug 14.
Article in English | MEDLINE | ID: mdl-34391351

ABSTRACT

The interplay of kinetics and thermodynamics governs reactive processes, and their control is key in synthesis efforts. While sophisticated numerical methods for studying equilibrium states are well advanced, quantitative predictions of kinetic behavior remain challenging. We introduce a reactant-to-barrier (R2B) machine learning model that rapidly and accurately infers activation energies and transition state geometries throughout the chemical compound space. R2B exhibits improving accuracy as training set sizes grow and requires as input solely the molecular graph of the reactant and the information of the reaction type. We provide numerical evidence for the applicability of R2B for two competing textbook reactions relevant to organic synthesis, E2 and SN2, trained and tested on chemically diverse quantum data from the literature. After training on 1-1.8k examples, R2B predicts activation energies on average within less than 2.5 kcal/mol with respect to the coupled-cluster singles doubles reference within milliseconds. Principal component analysis of kernel matrices reveals the hierarchy of the multiple scales underpinning reactivity in chemical space: Nucleophiles and leaving groups, substituents, and pairwise substituent combinations correspond to systematic lowering of eigenvalues. Analysis of R2B based predictions of ∼11.5k E2 and SN2 barriers in the gas phase for previously undocumented reactants indicates that on average, E2 is favored in 75% of all cases and that SN2 becomes likely for chlorine as nucleophile/leaving group and for substituents consisting of hydrogen or electron-withdrawing groups. R2B enables experimental reaction design from first principles, which is demonstrated by the construction of decision trees. Numerical R2B based results for interatomic distances and angles of reactant and transition state geometries suggest that Hammond's postulate is applicable to SN2, but not to E2.

14.
J Chem Phys ; 154(13): 134113, 2021 Apr 07.
Article in English | MEDLINE | ID: mdl-33832231

ABSTRACT

Free energies govern the behavior of soft and liquid matter, and improving their predictions could have a large impact on the development of drugs, electrolytes, or homogeneous catalysts. Unfortunately, it is challenging to devise an accurate description of effects governing solvation such as hydrogen-bonding, van der Waals interactions, or conformational sampling. We present a Free energy Machine Learning (FML) model applicable throughout chemical compound space and based on a representation that employs Boltzmann averages to account for an approximated sampling of configurational space. Using the FreeSolv database, FML's out-of-sample prediction errors of experimental hydration free energies decay systematically with training set size, and experimental uncertainty (0.6 kcal/mol) is reached after training on 490 molecules (80% of FreeSolv). Corresponding FML model errors are on par with state-of-the-art physics-based approaches. To generate the input representation for a new query compound, FML requires approximate and short molecular dynamics runs. We showcase its usefulness through analysis of solvation free energies for 116k organic molecules (all force-field compatible molecules in the QM9 database), identifying the most and least solvated systems and rediscovering quasi-linear structure-property relationships in terms of simple descriptors such as hydrogen-bond donors, number of NH or OH groups, number of oxygen atoms in hydrocarbons, and number of heavy atoms. FML's accuracy is maximal when the temperature used for the molecular dynamics simulation to generate averaged input representation samples in training is the same as for the query compounds. The sampling time for the representation converges rapidly with respect to the prediction error.
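
The representation described here rests on Boltzmann averages over sampled configurations. For frames drawn from a molecular dynamics trajectory at fixed temperature, a plain mean over frames already approximates that average. For a discrete set of conformers with known energies, the weights have to be applied explicitly, which is what the sketch below shows. The function name, the feature vectors, and the energies are hypothetical; this is not the FML code.

```python
import numpy as np

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def boltzmann_average(features, energies, temperature=298.15):
    """Average per-conformer feature vectors with weights exp(-E / kT).

    features: (n_conformers, n_features) array
    energies: (n_conformers,) array in kcal/mol
    """
    features = np.asarray(features, dtype=float)
    energies = np.asarray(energies, dtype=float)
    beta = 1.0 / (K_B_KCAL * temperature)
    # Shift by the minimum energy for numerical stability; weights are unchanged.
    weights = np.exp(-beta * (energies - energies.min()))
    weights /= weights.sum()
    return weights @ features

# Two hypothetical conformers with two-component feature vectors.
feats = [[1.0, 0.0], [0.0, 1.0]]
avg_degenerate = boltzmann_average(feats, [0.0, 0.0])   # equal energies
avg_dominated = boltzmann_average(feats, [0.0, 100.0])  # first conformer dominates
```

Because the weights depend on temperature, an averaged representation built at one temperature differs from one built at another, which fits the abstract's remark that accuracy is highest when training and query representations are generated at the same temperature.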

15.
Phys Chem Chem Phys ; 22(19): 10519-10525, 2020 May 21.
Article in English | MEDLINE | ID: mdl-31960870

ABSTRACT

We assess the applicability of alchemical perturbation density functional theory (APDFT) for quickly and accurately estimating deprotonation energies. We have considered all possible single and double deprotonations in one hundred small organic molecules drawn at random from QM9 [Ramakrishnan et al., JCTC, 2015]. Numerical evidence is presented for 5160 deprotonated species at both HF/def2-TZVP and CCSD/6-31G* levels of theory. We show that the perturbation expansion formalism of APDFT quickly converges to reliable results: using CCSD electron densities and derivatives, regular Hartree-Fock calculations are outperformed at the second or third order for ranking all possible doubly or singly deprotonated molecules, respectively. CCSD single deprotonation energies are reproduced within 1.4 kcal mol⁻¹ on average within third order APDFT. We introduce a hybrid approach where the computational cost of APDFT is reduced even further by mixing first order terms at a higher level of theory (CCSD) with higher order terms at a lower level of theory only (HF). We find that this approach reaches 2 kcal mol⁻¹ accuracy in absolute deprotonation energies compared to CCSD at 2% of the computational cost of third order APDFT.

16.
Phys Chem Chem Phys ; 22(24): 13431-13439, 2020 Jun 24.
Article in English | MEDLINE | ID: mdl-32515452

ABSTRACT

The Diels-Alder cycloaddition, in which a diene reacts with a dienophile to form a cyclic compound, counts among the most important tools in organic synthesis. Achieving a precise understanding of its mechanistic details on the quantum level requires new experimental and theoretical methods. Here, we present an experimental approach that separates different diene conformers in a molecular beam as a prerequisite for the investigation of their individual cycloaddition reaction kinetics and dynamics under single-collision conditions in the gas phase. A low- and high-level quantum-chemistry-based screening of more than one hundred dienes identified 2,3-dibromobutadiene (DBB) as an optimal candidate for efficient separation of its gauche and s-trans conformers by electrostatic deflection. A preparation method for DBB was developed which enabled the generation of dense molecular beams of this compound. The theoretical predictions of the molecular properties of DBB were validated by the successful separation of the conformers in the molecular beam. A marked difference in photofragment ion yields of the two conformers upon femtosecond-laser pulse ionization was observed, pointing to pronounced conformer-specific fragmentation dynamics of ionized DBB. Our work sets the stage for a rigorous examination of mechanistic models of cycloaddition reactions under controlled conditions in the gas phase.

17.
J Phys Chem A ; 124(42): 8853-8865, 2020 Oct 22.
Article in English | MEDLINE | ID: mdl-32970440

ABSTRACT

Machine learning (ML) has become a promising tool for improving the quality of atomistic simulations. Using formaldehyde as a benchmark system for intramolecular interactions, a comparative assessment of ML models based on state-of-the-art variants of deep neural networks (NNs), reproducing kernel Hilbert space (RKHS+F), and kernel ridge regression (KRR) is presented. Learning curves for energies and atomic forces indicate rapid convergence toward excellent predictions for B3LYP, MP2, and CCSD(T)-F12 reference results for modestly sized (in the hundreds) training sets. Typically, learning curve offsets decay as one goes from NN (PhysNet) to RKHS+F to KRR (FCHL). Conversely, the predictive power for extrapolation of energies toward new geometries increases in the same order with RKHS+F and FCHL performing almost equally. For harmonic vibrational frequencies, the picture is less clear, with PhysNet and FCHL yielding accuracies of ∼1 and ∼0.2 cm⁻¹, respectively, no matter which reference method, while RKHS+F models level off for B3LYP and exhibit continued improvements for MP2 and CCSD(T)-F12. Finite-temperature molecular dynamics (MD) simulations using the PESs from the three ML methods with identical initial conditions yield indistinguishable infrared spectra with good performance compared with experiment except for the high-frequency modes involving hydrogen stretch motion, which is a known limitation of MD for vibrational spectroscopy. For sufficiently large training set sizes, all three models can detect insufficient convergence ("noise") of the reference electronic structure calculations in that the learning curves level off. Transfer learning (TL) from B3LYP to CCSD(T)-F12 with PhysNet indicates that additional improvements in data efficiency can be achieved.

18.
J Chem Phys ; 153(14): 144118, 2020 Oct 14.
Article in English | MEDLINE | ID: mdl-33086815

ABSTRACT

Alchemical perturbation density functional theory has been shown to be an efficient and computationally inexpensive way to explore chemical compound space. We investigate approximations made, in terms of atomic basis sets and the perturbation order, introduce an electron-density based estimate of errors of the alchemical prediction, and propose a correction for effects due to basis set incompleteness. Our numerical analysis of potential energy estimates, and resulting binding curves, is based on coupled-cluster single double (CCSD) reference results and is limited to all neutral diatomics with 14 electrons (AlH⋯NN). The method predicts binding energy, equilibrium distance, and vibrational frequencies of neighboring out-of-sample diatomics with near CCSD quality using perturbations up to the fifth order. We also discuss simultaneous alchemical mutations at multiple sites in benzene.

19.
J Chem Phys ; 150(6): 064105, 2019 Feb 14.
Article in English | MEDLINE | ID: mdl-30769998

ABSTRACT

The role of response operators is well established in quantum mechanics. We investigate their use for universal quantum machine learning models of response properties in molecules. After introducing a theoretical basis, we present and discuss numerical evidence based on measuring the potential energy's response with respect to atomic displacement and to electric fields. Prediction errors for corresponding properties, atomic forces, and dipole moments improve in a systematic fashion with training set size and reach high accuracy for small training sets. Prediction of normal modes and infrared-spectra of some small molecules demonstrates the usefulness of this approach for chemistry.
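
The abstract treats atomic forces as the response of the potential energy to atomic displacement. A simple way to see that idea in a kernel model is to differentiate the kernel analytically, so that the force prediction is the negative derivative of the energy prediction. The one-dimensional sketch below does this for a Gaussian-kernel model trained on energies only. It is an illustration under assumed settings (kernel, width `SIGMA`, regularizer, harmonic toy potential) and does not reproduce the operator formalism of the paper.

```python
import numpy as np

SIGMA = 0.6  # kernel width (assumed for this toy example)

def kernel(xa, xb):
    """Gaussian kernel between two 1D coordinate arrays."""
    return np.exp(-(xa[:, None] - xb[None, :]) ** 2 / (2.0 * SIGMA ** 2))

def kernel_dx(xa, xb):
    """Analytic derivative of kernel(xa, xb) with respect to xa."""
    diff = xa[:, None] - xb[None, :]
    return -diff / SIGMA ** 2 * kernel(xa, xb)

# Train on energies of a harmonic toy potential E(x) = x**2 / 2.
x_train = np.linspace(-1.5, 1.5, 12)
e_train = 0.5 * x_train ** 2
alpha = np.linalg.solve(kernel(x_train, x_train) + 1e-6 * np.eye(12), e_train)

def energy(x):
    return kernel(x, x_train) @ alpha

def force(x):
    # Force is minus the derivative of the model energy with respect to x.
    return -(kernel_dx(x, x_train) @ alpha)

x_q = np.array([0.4])
f_model = force(x_q)[0]

# Central finite difference of the model energy, for a consistency check.
h = 1e-4
f_fd = -(energy(x_q + h)[0] - energy(x_q - h)[0]) / (2.0 * h)
```

The same differentiation pattern would apply to other responses, for example differentiating with respect to an external field to obtain a dipole-like quantity, provided the representation depends on that perturbation.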

20.
Chimia (Aarau) ; 73(12): 1028-1031, 2019 Dec 18.
Article in English | MEDLINE | ID: mdl-31883556

ABSTRACT

The identification and use of structure-property relationships lie at the heart of the chemical sciences. Quantum mechanics forms the basis for the unbiased virtual exploration of chemical compound space (CCS), imposing substantial compute needs if chemical accuracy is to be reached. In order to accelerate predictions of quantum properties without compromising accuracy, our lab has been developing quantum machine learning (QML) based models which can be applied throughout CCS. Here, we briefly explain, review, and discuss the recently introduced operator formalism which substantially improves the data efficiency for QML models of common response properties.
