Nat Chem ; 16(5): 727-734, 2024 May.
Article in English | MEDLINE | ID: mdl-38454071


Atomistic simulation has a broad range of applications from drug design to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive ab initio simulations. For this reason, chemistry and materials science would greatly benefit from a general reactive MLIP, that is, an MLIP that is applicable to a broad range of reactive chemistry without the need for refitting. Here we develop a general reactive MLIP (ANI-1xnr) through automated sampling of condensed-phase reactions. ANI-1xnr is then applied to study five distinct systems: carbon solid-phase nucleation, graphene ring formation from acetylene, biofuel additives, combustion of methane and the spontaneous formation of glycine from early earth small molecules. In all studies, ANI-1xnr closely matches experiment (when available) and/or previous studies using traditional model chemistry methods. As such, ANI-1xnr proves to be a highly general reactive MLIP for C, H, N and O elements in the condensed phase, enabling high-throughput in silico reactive chemistry experimentation.

J Chem Theory Comput ; 20(3): 1274-1281, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38307009


Methodologies for training machine learning potentials (MLPs) with quantum-mechanical simulation data have recently seen tremendous progress. Experimental data have a very different character than simulated data, and most MLP training procedures cannot be easily adapted to incorporate both types of data into the training process. We investigate a training procedure based on iterative Boltzmann inversion that produces a pair potential correction to an existing MLP using equilibrium radial distribution function data. By applying these corrections to an MLP for pure aluminum based on density functional theory, we observe that the resulting model largely addresses previous overstructuring in the melt phase. Interestingly, the corrected MLP also exhibits improved performance in predicting experimental diffusion constants, which are not included in the training procedure. The presented method does not require autodifferentiating through a molecular dynamics solver and does not make assumptions about the MLP architecture. Our results suggest a practical framework for incorporating experimental data into machine learning models to improve the accuracy of molecular dynamics simulations.
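
The pair-potential correction described in this abstract can be illustrated with the textbook iterative Boltzmann inversion update, V_new(r) = V_old(r) + kT ln(g_model(r) / g_target(r)). The following is a minimal sketch under that assumption, not the authors' implementation; the function name, the `damping` factor, and the `g_floor` cutoff are hypothetical.

```python
import math

def ibi_update(pair_correction, g_model, g_target, kT, damping=1.0, g_floor=1e-8):
    """One iterative Boltzmann inversion step on a shared radial grid.

    pair_correction: current pair-potential correction at each grid point
    g_model:  radial distribution function from the current model
    g_target: experimental (target) radial distribution function
    """
    updated = []
    for v, gm, gt in zip(pair_correction, g_model, g_target):
        if gm < g_floor or gt < g_floor:
            # Inside the excluded-volume core the logarithm is undefined,
            # so the correction is left unchanged there (an assumption).
            updated.append(v)
        else:
            # Where the model is overstructured (gm > gt) the correction
            # becomes more repulsive, which lowers g(r) at that distance.
            updated.append(v + damping * kT * math.log(gm / gt))
    return updated
```

In practice the step would be repeated, rerunning molecular dynamics with the corrected potential each time, until the simulated and target distribution functions agree.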

Phys Rev E ; 109(1-2): 015302, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38366449


The rational function approximation provides a natural and interpretable representation of response functions such as the many-body spectral functions. We apply the vector fitting (VFIT) algorithm to fit a variety of spectral functions calculated from the Holstein model of electron-phonon interactions. We show that the resulting rational functions are highly efficient in their fitting of sharp features in the spectral functions, and could provide a means to infer physically relevant information from a spectral data set. The positions of the peaks in the approximated spectral function are determined by the locations of poles in the complex plane. In addition, we developed a variant of VFIT that incorporates regularization to improve the quality of fits. With this procedure, we demonstrate it is possible to achieve accurate spectral function fits that vary smoothly as a function of physical conditions.
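
The link between pole locations and peak positions can be made concrete with a pole-residue form, G(z) = Σ r_k / (z − p_k), and the spectral function A(ω) = −Im G(ω) / π. The sketch below only evaluates such a function; it does not implement the VFIT pole-relocation step, and the function names and example poles are hypothetical.

```python
import math

def rational_function(z, poles, residues):
    """Evaluate G(z) = sum_k r_k / (z - p_k)."""
    return sum(r / (z - p) for p, r in zip(poles, residues))

def spectral_function(omega, poles, residues):
    """A(omega) = -Im G(omega) / pi on the real frequency axis."""
    return -rational_function(complex(omega, 0.0), poles, residues).imag / math.pi

# Two poles in the lower half-plane: the real part sets the peak position,
# the magnitude of the imaginary part sets the peak half-width.
example_poles = [-1.0 - 0.05j, 0.5 - 0.2j]
example_residues = [0.7, 0.3]
```

With real residues each pole contributes a Lorentzian centered at Re p_k with half-width |Im p_k|, so a pole close to the real axis produces a sharp peak.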

J Chem Theory Comput ; 20(2): 891-901, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38168674


A light-matter hybrid quasiparticle, called a polariton, is formed when molecules are strongly coupled to an optical cavity. Recent experiments have shown that polariton chemistry can manipulate chemical reactions. Polariton chemistry is a collective phenomenon, and its effects increase with the number of molecules in a cavity. However, simulating an ensemble of molecules in the excited state coupled to a cavity mode is theoretically and computationally challenging. Recent advances in machine learning (ML) techniques have shown promising capabilities in modeling ground-state chemical systems. This work presents a general protocol to predict excited-state properties, such as energies, transition dipoles, and nonadiabatic coupling vectors with the hierarchically interacting particle neural network. ML predictions are then applied to compute the potential energy surfaces and electronic spectra of a prototype azomethane molecule in the collective coupling scenario. These computational tools provide a much-needed framework to model and understand many molecules' emerging excited-state polariton chemistry.

J Chem Phys ; 159(11), 2023 Sep 21.
Article in English | MEDLINE | ID: mdl-37712780


Catalyzed by enormous success in the industrial sector, many research programs have been exploring data-driven, machine learning approaches. Performance can be poor when the model is extrapolated to new regions of chemical space, e.g., new bonding types, new many-body interactions. Another important limitation is the spatial locality assumption in model architecture, and this limitation cannot be overcome with larger or more diverse datasets. The outlined challenges are primarily associated with the lack of electronic structure information in surrogate models such as interatomic potentials. Given the fast development of machine learning and computational chemistry methods, we expect some limitations of surrogate models to be addressed in the near future; nevertheless, the spatial locality assumption will likely remain a limiting factor for their transferability. Here, we suggest focusing on an equally important effort: the design of physics-informed models that leverage the domain knowledge and employ machine learning only as a corrective tool. In the context of material science, we will focus on semi-empirical quantum mechanics, using machine learning to predict corrections to the reduced-order Hamiltonian model parameters. The resulting models are broadly applicable, retain the speed of semiempirical chemistry, and frequently achieve accuracy on par with much more expensive ab initio calculations. These early results indicate that future work, in which machine learning and quantum chemistry methods are developed jointly, may provide the best of all worlds for chemistry applications that demand both high accuracy and high numerical efficiency.

Phys Rev E ; 107(5-2): 055301, 2023 May.
Article in English | MEDLINE | ID: mdl-37329105


We consider a class of Hubbard-Stratonovich transformations suitable for treating Hubbard interactions in the context of quantum Monte Carlo simulations. A tunable parameter p allows us to continuously vary from a discrete Ising auxiliary field (p=∞) to a compact auxiliary field that couples to electrons sinusoidally (p=0). In tests on the single-band square and triangular Hubbard models, we find that the severity of the sign problem decreases systematically with increasing p. Selecting p finite, however, enables continuous sampling methods such as the Langevin or Hamiltonian Monte Carlo methods. We explore the tradeoffs between various simulation methods through numerical benchmarks.

Electrons, Computer Simulation, Monte Carlo Method
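
For orientation, the discrete Ising end of this family can be checked numerically. The sketch assumes that the p=∞ limit corresponds to the standard Hirsch decoupling, exp(−Δτ U (n↑ − 1/2)(n↓ − 1/2)) = (1/2) e^(−Δτ U/4) Σ_{s=±1} exp(λ s (n↑ − n↓)) with cosh λ = exp(Δτ U/2) and U > 0. The continuous-field members of the family are not reproduced here, and the function name is hypothetical.

```python
import math

def ising_hs_check(U, dtau):
    """Compare both sides of the discrete Hubbard-Stratonovich identity.

    Returns (n_up, n_dn, exact, decoupled) for all four site occupations.
    Assumes a repulsive interaction, U > 0, so that acosh is defined.
    """
    lam = math.acosh(math.exp(0.5 * dtau * U))
    prefactor = 0.5 * math.exp(-0.25 * dtau * U)
    rows = []
    for n_up in (0, 1):
        for n_dn in (0, 1):
            # Left-hand side: the interacting Boltzmann factor.
            exact = math.exp(-dtau * U * (n_up - 0.5) * (n_dn - 0.5))
            # Right-hand side: sum over the Ising auxiliary field s = +1, -1.
            decoupled = prefactor * sum(
                math.exp(lam * s * (n_up - n_dn)) for s in (+1, -1)
            )
            rows.append((n_up, n_dn, exact, decoupled))
    return rows
```

Calling `ising_hs_check(4.0, 0.1)` returns matching values in the last two columns for every occupation, which is the property that lets the interaction be replaced by a sum over auxiliary field configurations.
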
Nat Commun ; 14(1): 3626, 2023 Jun 19.
Article in English | MEDLINE | ID: mdl-37336881


Magnetic skyrmions are nanoscale topological textures that have been recently observed in different families of quantum magnets. These objects are called CP1 skyrmions because they are built from dipoles: the target manifold is the 1D complex projective space, CP1 ≅ S2. Here we report the emergence of magnetic CP2 skyrmions in a realistic spin-1 model, which includes both dipole and quadrupole moments. Unlike CP1 skyrmions, CP2 skyrmions can also arise as metastable textures of quantum paramagnets, opening a new road to discover emergent topological solitons in non-magnetic materials. The quantum phase diagram of the spin-1 model also includes magnetic field-induced CP2 skyrmion crystals that can be detected with regular momentum- (diffraction) and real-space (Lorentz transmission electron microscopy) experimental techniques.

J Chem Theory Comput ; 19(11): 3209-3222, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37163680


Extended Lagrangian Born-Oppenheimer molecular dynamics (XL-BOMD) in its most recent shadow potential energy version has been implemented in the semiempirical PyTorch-based software PySeQM. The implementation includes finite electronic temperatures, canonical density matrix perturbation theory, and an adaptive Krylov subspace approximation for the integration of the electronic equations of motion within the XL-BOMD approach (KSA-XL-BOMD). The PyTorch implementation leverages the use of GPU and machine learning hardware accelerators for the simulations. The new XL-BOMD formulation allows studying more challenging chemical systems with charge instabilities and low electronic energy gaps. The current public release of PySeQM continues our development of modular architecture for large-scale simulations employing semi-empirical quantum-mechanical treatment. Applied to a molecular dynamics simulation of 840 carbon atoms, one integration time step executes in 4 s on a single Nvidia RTX A6000 GPU.

J Chem Phys ; 158(18), 2023 May 14.
Article in English | MEDLINE | ID: mdl-37158328


Atomistic machine learning focuses on the creation of models that obey fundamental symmetries of atomistic configurations, such as permutation, translation, and rotation invariances. In many of these schemes, translation and rotation invariance are achieved by building on scalar invariants, e.g., distances between atom pairs. There is growing interest in molecular representations that work internally with higher rank rotational tensors, e.g., vector displacements between atoms, and tensor products thereof. Here, we present a framework for extending the Hierarchically Interacting Particle Neural Network (HIP-NN) with Tensor Sensitivity information (HIP-NN-TS) from each local atomic environment. Crucially, the method employs a weight tying strategy that allows direct incorporation of many-body information while adding very few model parameters. We show that HIP-NN-TS is more accurate than HIP-NN, with negligible increase in parameter count, for several datasets and network sizes. As the dataset becomes more complex, tensor sensitivities provide greater improvements to model accuracy. In particular, HIP-NN-TS achieves a record mean absolute error of 0.927 kcal/mol for conformational energy variation on the challenging COMP6 benchmark, which includes a broad set of organic molecules. We also compare the computational performance of HIP-NN-TS to HIP-NN and other models in the literature.

Nat Comput Sci ; 3(3): 230-239, 2023 Mar.
Article in English | MEDLINE | ID: mdl-38177878


Machine learning (ML) models, if trained to data sets of high-fidelity quantum simulations, produce accurate and efficient interatomic potentials. Active learning (AL) is a powerful tool to iteratively generate diverse data sets. In this approach, the ML model provides an uncertainty estimate along with its prediction for each new atomic configuration. If the uncertainty estimate passes a certain threshold, then the configuration is included in the data set. Here we develop a strategy to more rapidly discover configurations that meaningfully augment the training data set. The approach, uncertainty-driven dynamics for active learning (UDD-AL), modifies the potential energy surface used in molecular dynamics simulations to favor regions of configuration space for which there is large model uncertainty. The performance of UDD-AL is demonstrated for two AL tasks: sampling the conformational space of glycine and sampling the promotion of proton transfer in acetylacetone. The method is shown to efficiently explore the chemically relevant configuration space, which may be inaccessible using regular dynamical sampling at target temperature conditions.

Fabaceae, Uncertainty, Glycine, Machine Learning, Molecular Dynamics Simulation
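
The threshold rule in this abstract (include a configuration when its uncertainty estimate passes a threshold) can be sketched as below. How the uncertainty is obtained is an assumption here: the sketch uses the standard deviation across an ensemble of model predictions. The function names are hypothetical, and the biased potential energy surface of UDD-AL is not modeled.

```python
import statistics

def ensemble_uncertainty(predicted_energies):
    """Spread of the energy predicted by each ensemble member (assumed metric)."""
    return statistics.pstdev(predicted_energies)

def select_for_labeling(candidates, threshold):
    """Return ids of configurations whose uncertainty exceeds the threshold.

    candidates: iterable of (config_id, [energy predicted by each model])
    """
    return [
        config_id
        for config_id, energies in candidates
        if ensemble_uncertainty(energies) > threshold
    ]
```

Configurations returned by `select_for_labeling` would then be sent to the reference quantum calculation and added to the training set before the models are retrained.
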
Proc Natl Acad Sci U S A ; 119(27): e2120333119, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35776544


Conventional machine-learning (ML) models in computational chemistry learn to directly predict molecular properties using quantum chemistry only for reference data. While these heuristic ML methods show quantum-level accuracy with speeds several orders of magnitude faster than traditional quantum chemistry methods, they suffer from poor extensibility and transferability; i.e., their accuracy degrades on large or new chemical systems. Incorporating quantum chemistry frameworks into the ML models directly solves this problem. Here we take the structure of semiempirical quantum mechanics (SEQM) methods to construct dynamically responsive Hamiltonians. SEQM methods use empirical parameters fitted to experimental properties to construct reduced-order Hamiltonians, facilitating much faster calculations than ab initio methods but with compromised accuracy. By replacing these static parameters with machine-learned dynamic values inferred from the local environment, we greatly improve the accuracy of the SEQM methods. Trained on molecular energies and atomic forces, these dynamically generated Hamiltonian parameters show a strong correlation with atomic hybridization and bonding. Trained with only about 60,000 small organic molecular conformers, the resulting model retains interpretability, extensibility, and transferability when testing on much larger chemical systems and predicting various molecular properties. Overall, this work demonstrates the virtues of incorporating physics-based descriptions with ML to develop models that are simultaneously accurate, transferable, and interpretable.

Phys Rev E ; 105(6-2): 065302, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35854479


We introduce methodologies for highly scalable quantum Monte Carlo simulations of electron-phonon models, and we report benchmark results for the Holstein model on the square lattice. The determinant quantum Monte Carlo (DQMC) method is a widely used tool for simulating simple electron-phonon models at finite temperatures, but it incurs a computational cost that scales cubically with system size. Alternatively, near-linear scaling with system size can be achieved with the hybrid Monte Carlo (HMC) method and an integral representation of the Fermion determinant. Here, we introduce a collection of methodologies that make such simulations even faster. To combat "stiffness" arising from the bosonic action, we review how Fourier acceleration can be combined with time-step splitting. To overcome phonon sampling barriers associated with strongly bound bipolaron formation, we design global Monte Carlo updates that approximately respect particle-hole symmetry. To accelerate the iterative linear solver, we introduce a preconditioner that becomes exact in the adiabatic limit of infinite atomic mass. Finally, we demonstrate how stochastic measurements can be accelerated using fast Fourier transforms. These methods are all complementary and, combined, may produce multiple orders of magnitude speedup, depending on model details.

Phys Rev E ; 105(4-2): 045311, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35590547


We present a method to facilitate Monte Carlo simulations in the grand canonical ensemble given a target mean particle number. The method imposes a fictitious dynamics on the chemical potential, to be run concurrently with the Monte Carlo sampling of the physical system. Corrections to the chemical potential are made according to time-averaged estimates of the mean and variance of the particle number, with the latter being proportional to thermodynamic compressibility. We perform a variety of tests, and in all cases find rapid convergence of the chemical potential; inexactness of the tuning algorithm contributes only a minor part of the total measurement error for realistic simulations.
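
The role of the variance can be seen from the grand canonical identity d⟨N⟩/dμ = β Var(N), which suggests the correction μ ← μ + (N_target − ⟨N⟩) / (β Var(N)). The sketch below applies that update to a toy system of independent single-level sites, using exact moments in place of the time-averaged Monte Carlo estimates described in the abstract. It is an illustration under those assumptions rather than the published algorithm, and all names are hypothetical.

```python
import math

def site_occupation(mu, beta, level_energy=0.0):
    """Mean occupation of one fermionic level at chemical potential mu."""
    return 1.0 / (1.0 + math.exp(-beta * (mu - level_energy)))

def tune_chemical_potential(target_n, n_sites, beta, mu=0.0, steps=50):
    """Adjust mu until the mean particle number matches target_n."""
    for _ in range(steps):
        f = site_occupation(mu, beta)
        mean_n = n_sites * f              # exact mean for independent sites
        var_n = n_sites * f * (1.0 - f)   # exact variance for independent sites
        # Newton-like step using d<N>/dmu = beta * Var(N).
        mu += (target_n - mean_n) / (beta * var_n)
    return mu
```

For a target of 30 particles on 100 sites at β = 2, the iteration settles at μ = ln(0.3/0.7)/β. In a real simulation the noisy estimates would call for averaging or damping of the step.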

Phys Rev E ; 105(4-2): 045301, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35590626


We propose a data-driven method to describe consistent equations of state (EOS) for arbitrary systems. Complex EOS are traditionally obtained by fitting suitable analytical expressions to thermophysical data. A key aspect of EOS is that the relationships between state variables are given by derivatives of the system free energy. In this work, we model the free energy with an artificial neural network and utilize automatic differentiation to directly learn the derivatives of the free energy. We demonstrate this approach on two different systems, the analytic van der Waals EOS and published data for the Lennard-Jones fluid, and we show that it is advantageous over direct learning of thermodynamic properties (i.e., not as derivatives of the free energy but as independent properties), in terms of both accuracy and the exact preservation of the Maxwell relations. Furthermore, the method implicitly provides the free energy of a system without explicit integration.
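
The idea of obtaining state variables as derivatives of a free-energy model can be sketched with the van der Waals case, where P = −∂F/∂V. In this illustration the analytic van der Waals Helmholtz free energy stands in for the trained neural network, and a small hand-written dual-number class stands in for an automatic differentiation library; both substitutions, and all names, are assumptions made for the example.

```python
import math

class Dual:
    """A value and its first derivative (forward-mode automatic differentiation)."""

    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.value + o.value, self.deriv + o.deriv)
    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.value, -self.deriv)

    def __sub__(self, other):
        return self + (-Dual.lift(other))

    def __rsub__(self, other):
        return Dual.lift(other) + (-self)

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.value * o.value,
                    self.deriv * o.value + self.value * o.deriv)
    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.value / o.value,
                    (self.deriv * o.value - self.value * o.deriv) / o.value ** 2)

    def __rtruediv__(self, other):
        return Dual.lift(other) / self

def dual_log(x):
    return Dual(math.log(x.value), x.deriv / x.value)

def vdw_free_energy(volume, n, kT, a, b):
    """van der Waals Helmholtz free energy, omitting volume-independent terms."""
    return -n * kT * dual_log(volume - n * b) - a * n * n / volume

def pressure(volume, n, kT, a, b):
    """P = -dF/dV, with the derivative propagated exactly through F."""
    free_energy = vdw_free_energy(Dual(volume, 1.0), n, kT, a, b)
    return -free_energy.deriv
```

Because the pressure is computed from the free energy rather than fitted separately, it reproduces n kT / (V − n b) − a n² / V² to rounding error, and any other derived property would be consistent with the same free energy by construction.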

Nat Rev Chem ; 6(9): 653-672, 2022 Sep.
Article in English | MEDLINE | ID: mdl-37117713


Machine learning (ML) is becoming a method of choice for modelling complex chemical processes and materials. ML provides a surrogate model trained on a reference dataset that can be used to establish a relationship between a molecular structure and its chemical properties. This Review highlights developments in the use of ML to evaluate chemical properties such as partial atomic charges, dipole moments, spin and electron densities, and chemical bonding, as well as to obtain a reduced quantum-mechanical description. We overview several modern neural network architectures, their predictive capabilities, generality and transferability, and illustrate their applicability to various chemical properties. We emphasize that learned molecular representations resemble quantum-mechanical analogues, demonstrating the ability of the models to capture the underlying physics. We also discuss how ML models can describe non-local quantum effects. Finally, we conclude by compiling a list of available ML toolboxes, summarizing the unresolved challenges and presenting an outlook for future development. The observed trends demonstrate that this field is evolving towards physics-based models augmented by ML, which is accompanied by the development of new methods and the rapid growth of user-friendly ML frameworks for chemistry.

J Chem Theory Comput ; 17(10): 6180-6192, 2021 Oct 12.
Article in English | MEDLINE | ID: mdl-34595916


Tensor cores, along with tensor processing units, represent a new form of hardware acceleration specifically designed for deep neural network calculations in artificial intelligence applications. Tensor cores provide extraordinary computational speed and energy efficiency but with the caveat that they were designed for tensor contractions (matrix-matrix multiplications) using only low-precision floating-point operations. Despite this perceived limitation, we demonstrate how tensor cores can be applied with high efficiency to the challenging and numerically sensitive problem of quantum-based Born-Oppenheimer molecular dynamics, which requires highly accurate electronic structure optimizations and conservative force evaluations. The interatomic forces are calculated on-the-fly from an electronic structure that is obtained from a generalized deep neural network, where the computational structure naturally takes advantage of the exceptional processing power of the tensor cores and allows for high performance in excess of 100 Tflops on a single Nvidia A100 GPU. Stable molecular dynamics trajectories are generated using the framework of extended Lagrangian Born-Oppenheimer molecular dynamics, which combines computational efficiency with long-term stability, even when using approximate charge relaxations and force evaluations that are limited in accuracy by the numerically noisy conditions caused by the low-precision tensor core floating-point operations. A canonical ensemble simulation scheme is also presented, where the additional numerical noise in the calculated forces is absorbed into a Langevin-like dynamics.

Chem Sci ; 12(30): 10207-10217, 2021 Aug 04.
Article in English | MEDLINE | ID: mdl-34447529


Phosphorescence is commonly utilized for applications including light-emitting diodes and photovoltaics. Machine learning (ML) approaches trained on ab initio datasets of singlet-triplet energy gaps may expedite the discovery of phosphorescent compounds with the desired emission energies. However, we show that standard ML approaches for modeling potential energy surfaces inaccurately predict singlet-triplet energy gaps due to the failure to account for spatial localities of spin transitions. To solve this, we introduce localization layers in a neural network model that weight atomic contributions to the energy gap, thereby allowing the model to isolate the most determinative chemical environments. Trained on the singlet-triplet energy gaps of organic molecules, we apply our method to an out-of-sample test set of large phosphorescent compounds and demonstrate the substantial improvement that localization layers have on predicting their phosphorescence energies. Remarkably, the inferred localization weights have a strong relationship with the ab initio spin density of the singlet-triplet transition, and thus infer localities of the molecule that determine the spin transition, despite the fact that no direct electronic information was provided during training. The use of localization layers is expected to improve the modeling of many localized, non-extensive phenomena and could be implemented in any atom-centered neural network model.

J Chem Phys ; 154(24): 244108, 2021 Jun 28.
Article in English | MEDLINE | ID: mdl-34241371


The Hückel Hamiltonian is an incredibly simple tight-binding model known for its ability to capture qualitative physics phenomena arising from electron interactions in molecules and materials. Part of its simplicity arises from using only two types of empirically fit physics-motivated parameters: the first describes the orbital energies on each atom and the second describes electronic interactions and bonding between atoms. By replacing these empirical parameters with machine-learned dynamic values, we vastly increase the accuracy of the extended Hückel model. The dynamic values are generated with a deep neural network, which is trained to reproduce orbital energies and densities derived from density functional theory. The resulting model retains interpretability, while the deep neural network parameterization is smooth and accurate and reproduces insightful features of the original empirical parameterization. Overall, this work shows the promise of utilizing machine learning to formulate simple, accurate, and dynamically parameterized physics models.
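
The two parameter types mentioned in this abstract can be illustrated with the simple (not extended) Hückel model of a ring, where α is the on-site orbital energy and β couples bonded neighbors. The sketch below uses fixed constants for α and β, whereas the abstract replaces such parameters with neural-network outputs that are not shown here. Function names are hypothetical.

```python
import cmath
import math

def huckel_ring(n_atoms, alpha, beta):
    """Hückel Hamiltonian for a ring of n_atoms >= 3 identical sites."""
    h = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        h[i][i] = alpha                   # orbital energy on each atom
        j = (i + 1) % n_atoms
        h[i][j] = h[j][i] = beta          # interaction between bonded atoms
    return h

def ring_orbital(n_atoms, alpha, beta, k):
    """Analytic energy and coefficients of the k-th ring orbital."""
    energy = alpha + 2.0 * beta * math.cos(2.0 * math.pi * k / n_atoms)
    coeffs = [cmath.exp(2j * math.pi * k * m / n_atoms) / math.sqrt(n_atoms)
              for m in range(n_atoms)]
    return energy, coeffs

def residual(h, energy, coeffs):
    """Largest component of H c - E c; near zero for a true eigenpair."""
    return max(
        abs(sum(hij * cj for hij, cj in zip(row, coeffs)) - energy * ci)
        for row, ci in zip(h, coeffs)
    )
```

For a six-membered ring with α = 0 and β = −1 the orbital energies are −2, −1, −1, +1, +1, +2 in units of |β|, the familiar benzene π-level pattern.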

J Phys Chem Lett ; 12(26): 6227-6243, 2021 Jul 08.
Article in English | MEDLINE | ID: mdl-34196559


Machine learning (ML) is quickly becoming a premier tool for modeling chemical processes and materials. ML-based force fields, trained on large data sets of high-quality electronic structure calculations, are particularly attractive due to their unique combination of computational efficiency and physical accuracy. This Perspective summarizes some recent advances in the development of neural network-based interatomic potentials. Designing high-quality training data sets is crucial to overall model accuracy. One strategy is active learning, in which new data are automatically collected for atomic configurations that produce large ML uncertainties. Another strategy is to use the highest levels of quantum theory possible. Transfer learning allows training to a data set of mixed fidelity. A model initially trained to a large data set of density functional theory calculations can be significantly improved by retraining to a relatively small data set of expensive coupled cluster theory calculations. These advances are exemplified by applications to molecules and materials.

J Chem Theory Comput ; 17(4): 2256-2265, 2021 Apr 13.
Article in English | MEDLINE | ID: mdl-33797253


We present a second-order recursive Fermi-operator expansion scheme using mixed precision floating point operations to perform electronic structure calculations using tensor core units. A performance of over 100 teraFLOPs is achieved for half-precision floating point operations on Nvidia's A100 tensor core units. The second-order recursive Fermi-operator scheme is formulated in terms of a generalized, differentiable deep neural network structure, which solves the quantum mechanical electronic structure problem. We demonstrate how this network can be accelerated by optimizing the weight and bias values to substantially reduce the number of layers required for convergence. We also show how this machine learning approach can be used to optimize the coefficients of the recursive Fermi-operator expansion to accurately represent the fractional occupation numbers of the electronic states at finite temperatures.
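
The layered structure of a second-order recursive expansion can be sketched as follows, assuming the familiar SP2-type recursion in which each layer applies either X² or 2X − X², chosen by comparing the trace with the number of occupied states. The sketch runs in ordinary double precision at zero electronic temperature; the mixed-precision tensor core arithmetic, the optimized weights and biases, and the finite-temperature occupations from the abstract are not modeled. All names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def sp2_density_matrix(h, n_occ, e_min, e_max, max_layers=100, tol=1e-12):
    """Density matrix from recursive second-order projections.

    h: symmetric Hamiltonian as a list of rows
    n_occ: number of occupied states
    e_min, e_max: bounds enclosing the eigenvalue spectrum of h
    """
    n = len(h)
    # Input layer: map the spectrum linearly into [0, 1], lowest energy near 1.
    x = [[((e_max if i == j else 0.0) - h[i][j]) / (e_max - e_min)
          for j in range(n)] for i in range(n)]
    for _ in range(max_layers):
        x2 = matmul(x, x)
        if abs(trace(x) - trace(x2)) < tol:
            break                          # X is idempotent: converged
        if trace(x) > n_occ:
            x = x2                         # pushes eigenvalues toward 0
        else:
            x = [[2.0 * x[i][j] - x2[i][j] for j in range(n)]
                 for i in range(n)]        # pushes eigenvalues toward 1
    return x

# Three-site chain with hopping -1; row sums bound the spectrum within [-2, 2].
chain = [[0.0, -1.0, 0.0], [-1.0, 0.0, -1.0], [0.0, -1.0, 0.0]]
```

For the three-site chain with one occupied state, `sp2_density_matrix(chain, 1, -2.0, 2.0)` converges to the projector onto the lowest eigenvector. Each iteration costs one matrix-matrix multiplication, the operation that tensor cores are built to accelerate.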