Results 1 - 20 of 132
1.
Nature ; 589(7840): 59-64, 2021 01.
Article in English | MEDLINE | ID: mdl-33408379

ABSTRACT

Structurally disordered materials pose fundamental questions [1-4], including how different disordered phases ('polyamorphs') can coexist and transform from one phase to another [5-9]. Amorphous silicon has been extensively studied; it forms a fourfold-coordinated, covalent network at ambient conditions and much-higher-coordinated, metallic phases under pressure [10-12]. However, a detailed mechanistic understanding of the structural transitions in disordered silicon has been lacking, owing to the intrinsic limitations of even the most advanced experimental and computational techniques, for example, in terms of the system sizes accessible via simulation. Here we show how atomistic machine learning models trained on accurate quantum mechanical computations can help to describe liquid-amorphous and amorphous-amorphous transitions for a system of 100,000 atoms (ten-nanometre length scale), predicting structure, stability and electronic properties. Our simulations reveal a three-step transformation sequence for amorphous silicon under increasing external pressure. First, polyamorphic low- and high-density amorphous regions are found to coexist, rather than appearing sequentially. Then, we observe a structural collapse into a distinct very-high-density amorphous (VHDA) phase. Finally, our simulations indicate the transient nature of this VHDA phase: it rapidly nucleates crystallites, ultimately leading to the formation of a polycrystalline structure, consistent with experiments [13-15] but not seen in earlier simulations [11,16-18]. A machine learning model for the electronic density of states confirms the onset of metallicity during VHDA formation and the subsequent crystallization. These results shed light on the liquid and amorphous states of silicon, and, in a wider context, they exemplify a machine learning-driven approach to predictive materials modelling.

2.
Nature ; 585(7824): 217-220, 2020 09.
Article in English | MEDLINE | ID: mdl-32908269

ABSTRACT

Hydrogen, the simplest and most abundant element in the Universe, develops a remarkably complex behaviour upon compression [1]. Since Wigner predicted the dissociation and metallization of solid hydrogen at megabar pressures almost a century ago [2], several efforts have been made to explain the many unusual properties of dense hydrogen, including a rich and poorly understood solid polymorphism [1,3-5], an anomalous melting line [6] and the possible transition to a superconducting state [7]. Experiments at such extreme conditions are challenging and often lead to hard-to-interpret and controversial observations, whereas theoretical investigations are constrained by the huge computational cost of sufficiently accurate quantum mechanical calculations. Here we present a theoretical study of the phase diagram of dense hydrogen that uses machine learning to 'learn' potential-energy surfaces and interatomic forces from reference calculations and then predict them at low computational cost, overcoming length- and timescale limitations. We reproduce both the re-entrant melting behaviour and the polymorphism of the solid phase. Simulations using our machine-learning-based potentials provide evidence for a continuous molecular-to-atomic transition in the liquid, with no first-order transition observed above the melting line. This suggests a smooth transition between insulating and metallic layers in giant gas planets, and reconciles existing discrepancies between experiments as a manifestation of supercritical behaviour.

3.
J Chem Phys ; 161(4)2024 Jul 28.
Article in English | MEDLINE | ID: mdl-39056390

ABSTRACT

Machine-learning models based on a point-cloud representation of a physical object are ubiquitous in scientific applications and particularly well-suited to the atomic-scale description of molecules and materials. Among the many different approaches that have been pursued, the description of local atomic environments in terms of their discretized neighbor densities has been used widely and very successfully. We propose a novel density-based method, which involves computing "Wigner kernels." These are fully equivariant and body-ordered kernels that can be computed iteratively at a cost that is independent of the basis used to discretize the density and grows only linearly with the maximum body-order considered. Wigner kernels represent the infinite-width limit of feature-space models, whose dimensionality and computational cost instead scale exponentially with the increasing order of correlations. We present several examples of the accuracy of models based on Wigner kernels in chemical applications, for both scalar and tensorial targets, reaching an accuracy that is competitive with state-of-the-art deep-learning architectures. We discuss the broader relevance of these findings to equivariant geometric machine-learning.

4.
Chem Rev ; 121(16): 9759-9815, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34310133

ABSTRACT

The first step in the construction of a regression model or a data-driven analysis, aiming to predict or elucidate the relationship between the atomic-scale structure of matter and its properties, involves transforming the Cartesian coordinates of the atoms into a suitable representation. The development of atomic-scale representations has played, and continues to play, a central role in the success of machine-learning methods for chemistry and materials science. This review summarizes the current understanding of the nature and characteristics of the most commonly used structural and chemical descriptions of atomistic structures, highlighting the deep underlying connections between different frameworks and the ideas that lead to computationally efficient and universally applicable models. It emphasizes the link between properties, structures, their physical chemistry, and their mathematical description, provides examples of recent applications to a diverse set of chemical and materials science problems, and outlines the open questions and the most promising research directions in the field.

5.
Chem Rev ; 121(16): 10073-10141, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34398616

ABSTRACT

We provide an introduction to Gaussian process regression (GPR) machine-learning methods in computational materials science and chemistry. The focus of the present review is on the regression of atomistic properties: in particular, on the construction of interatomic potentials, or force fields, in the Gaussian Approximation Potential (GAP) framework; beyond this, we also discuss the fitting of arbitrary scalar, vectorial, and tensorial quantities. Methodological aspects of reference data generation, representation, and regression, as well as the question of how a data-driven model may be validated, are reviewed and critically discussed. A survey of applications to a variety of research questions in chemistry and materials science illustrates the rapid growth in the field. A vision is outlined for the development of the methodology in the years to come.
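As a rough illustration of the regression step described in this abstract, the following is a minimal sketch of Gaussian process regression on a one-dimensional toy function. It is not the GAP framework: the squared-exponential kernel, the `length_scale` and `noise` values, and the function names `rbf_kernel` and `gpr_fit_predict` are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gpr_fit_predict(x_train, y_train, x_test, noise=1e-3, length_scale=0.5):
    """Return the GPR posterior mean and variance at x_test."""
    k_tt = rbf_kernel(x_train, x_train, length_scale)
    k_st = rbf_kernel(x_test, x_train, length_scale)
    # Regularised kernel matrix; `noise` is the assumed variance of the data noise.
    k_reg = k_tt + noise * np.eye(len(x_train))
    weights = np.linalg.solve(k_reg, y_train)
    mean = k_st @ weights
    # Posterior variance: prior variance (1 for this kernel) minus the explained part.
    var = 1.0 - np.einsum("ij,ji->i", k_st, np.linalg.solve(k_reg, k_st.T))
    return mean, var

# Toy data (hypothetical): a smooth function sampled on [0, 3].
x_train = np.linspace(0.0, 3.0, 12)
y_train = np.sin(2.0 * x_train)
x_test = np.array([1.0, 1.5, 6.0])  # the last point lies far outside the data
mean, var = gpr_fit_predict(x_train, y_train, x_test)
```

The predicted variance stays small inside the sampled interval and approaches the prior variance far from the data, which is one simple way a data-driven model can flag extrapolation.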

6.
J Chem Phys ; 159(6)2023 Aug 14.
Article in English | MEDLINE | ID: mdl-37551818

ABSTRACT

Spherical harmonics provide a smooth, orthogonal, and symmetry-adapted basis to expand functions on a sphere, and they are used routinely in physical and theoretical chemistry as well as in different fields of science and technology, from geology and atmospheric sciences to signal processing and computer graphics. More recently, they have become a key component of rotationally equivariant models in geometric machine learning, including applications to atomic-scale modeling of molecules and materials. We present an elegant and efficient algorithm for the evaluation of the real-valued spherical harmonics. Our construction features many of the desirable properties of existing schemes and allows us to compute Cartesian derivatives in a numerically stable and computationally efficient manner. To facilitate usage, we implement this algorithm in sphericart, a fast C++ library that also provides C bindings, a Python API, and a PyTorch implementation that includes a GPU kernel.
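To make the objects in this abstract concrete, here is a minimal sketch that evaluates the real spherical harmonics up to l = 2 directly from Cartesian coordinates, using their standard closed-form expressions. This is not the algorithm of the paper and does not use the sphericart API; the function name `real_sph_harm_l2` and the output ordering are assumptions made for illustration.

```python
import numpy as np

def real_sph_harm_l2(xyz):
    """Real spherical harmonics up to l = 2 for Cartesian points of shape (n, 3).

    Returns shape (n, 9), ordered (l, m) = (0,0), (1,-1), (1,0), (1,1), (2,-2), ..., (2,2).
    """
    xyz = np.asarray(xyz, dtype=float)
    r = np.linalg.norm(xyz, axis=1, keepdims=True)
    x, y, z = (xyz / r).T  # direction cosines of each point
    c0 = 0.5 * np.sqrt(1.0 / np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    c2 = 0.5 * np.sqrt(15.0 / np.pi)
    return np.stack([
        c0 * np.ones_like(x),                                # l = 0
        c1 * y, c1 * z, c1 * x,                              # l = 1, m = -1, 0, 1
        c2 * x * y,                                          # l = 2, m = -2
        c2 * y * z,                                          # l = 2, m = -1
        0.25 * np.sqrt(5.0 / np.pi) * (3.0 * z**2 - 1.0),    # l = 2, m = 0
        c2 * x * z,                                          # l = 2, m = 1
        0.5 * c2 * (x**2 - y**2),                            # l = 2, m = 2
    ], axis=1)
```

A quick consistency check is that, for each l, the sum of the squared harmonics over m equals (2l + 1)/(4π) for any direction.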

7.
Nat Mater ; 20(3): 362-369, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33020610

ABSTRACT

The synthesis of molecular-sieving zeolitic membranes by the assembly of building blocks, avoiding the hydrothermal treatment, is highly desired to improve reproducibility and scalability. Here we report exfoliation of the sodalite precursor RUB-15 into crystalline 0.8-nm-thick nanosheets that host hydrogen-sieving six-membered rings (6-MRs) of SiO4 tetrahedra. Thin films, fabricated by the filtration of a suspension of exfoliated nanosheets, possess two transport pathways: 6-MR apertures and intersheet gaps. The latter were found to dominate the gas transport and yielded a molecular cutoff of 3.6 Å with a H2/N2 selectivity above 20. The gaps were successfully removed by the condensation of the terminal silanol groups of RUB-15 to yield H2/CO2 selectivities up to 100. The high selectivity arose exclusively from the transport across 6-MRs, which was confirmed by a good agreement between the experimentally determined apparent activation energy of H2 and that computed by ab initio calculations. The scalable fabrication and the attractive sieving performance at 250-300 °C make these membranes promising for precombustion carbon capture.

9.
J Chem Phys ; 156(20): 204115, 2022 May 28.
Article in English | MEDLINE | ID: mdl-35649823

ABSTRACT

Data-driven schemes that associate molecular and crystal structures with their microscopic properties share the need for a concise, effective description of the arrangement of their atomic constituents. Many types of models rely on descriptions of atom-centered environments, which are associated with an atomic property or with an atomic contribution to an extensive macroscopic quantity. Frameworks in this class can be understood in terms of atom-centered density correlations (ACDC), which are used as a basis for a body-ordered, symmetry-adapted expansion of the targets. Several other schemes that gather information on the relationship between neighboring atoms using "message-passing" ideas cannot be directly mapped to correlations centered around a single atom. We generalize the ACDC framework to include multi-centered information, generating representations that provide a complete linear basis to regress symmetric functions of atomic coordinates, and provide a coherent foundation to systematize our understanding of both atom-centered and message-passing and invariant and equivariant machine-learning schemes.

10.
J Chem Phys ; 156(1): 014115, 2022 Jan 07.
Article in English | MEDLINE | ID: mdl-34998321

ABSTRACT

Symmetry considerations are at the core of the major frameworks used to provide an effective mathematical representation of atomic configurations that is then used in machine-learning models to predict the properties associated with each structure. In most cases, the models rely on a description of atom-centered environments and are suitable to learn atomic properties or global observables that can be decomposed into atomic contributions. Many quantities that are relevant for quantum mechanical calculations, however (most notably the single-particle Hamiltonian matrix when written in an atomic orbital basis), are not associated with a single center, but with two (or more) atoms in the structure. We discuss a family of structural descriptors that generalize the very successful atom-centered density correlation features to the N-center case and show, in particular, how this construction can be applied to efficiently learn the matrix elements of the (effective) single-particle Hamiltonian written in an atom-centered orbital basis. These N-center features are fully equivariant (not only in terms of translations and rotations but also in terms of permutations of the indices associated with the atoms) and are suitable to construct symmetry-adapted machine-learning models of new classes of properties of molecules and materials.

11.
J Chem Phys ; 157(23): 234101, 2022 Dec 21.
Article in English | MEDLINE | ID: mdl-36550032

ABSTRACT

Machine learning frameworks based on correlations of interatomic positions begin with a discretized description of the density of other atoms in the neighborhood of each atom in the system. Symmetry considerations support the use of spherical harmonics to expand the angular dependence of this density, but there is, as of yet, no clear rationale to choose one radial basis over another. Here, we investigate the basis that results from the solution of the Laplacian eigenvalue problem within a sphere around the atom of interest. We show that this generates a basis of controllable smoothness within the sphere (in the same sense as plane waves provide a basis with controllable smoothness for a problem with periodic boundaries) and that a tensor product of Laplacian eigenstates also provides a smooth basis for expanding any higher-order correlation of the atomic density within the appropriate hypersphere. We consider several unsupervised metrics of the quality of a basis for a given dataset and show that the Laplacian eigenstate basis has a performance that is much better than some widely used basis sets and competitive with data-driven bases that numerically optimize each metric. Finally, we investigate the role of the basis in building models of the potential energy. In these tests, we find that a combination of the Laplacian eigenstate basis and target-oriented heuristics leads to equal or improved regression performance when compared to both heuristic and data-driven bases in the literature. We conclude that the smoothness of the basis functions is a key aspect of successful atomic density representations.
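For a concrete picture of the radial basis discussed in this abstract, the sketch below builds only the l = 0 solutions of the Laplacian eigenvalue problem in a sphere with a vanishing boundary condition, which have the closed form sin(nπr/R)/r, and checks their orthonormality numerically. The cutoff radius, grid size, and function names are assumptions for illustration; the full basis in the paper also covers higher angular channels, which this sketch does not attempt.

```python
import numpy as np

def radial_l0(n, r, cutoff):
    """n-th l = 0 Laplacian eigenfunction in a sphere of radius `cutoff` (zero at the boundary)."""
    return np.sqrt(2.0 / cutoff) * np.sin(n * np.pi * r / cutoff) / r

def overlap_matrix(n_max, cutoff=5.0, n_grid=20000):
    """Overlaps of the first n_max functions with the r^2 dr volume element (midpoint rule)."""
    h = cutoff / n_grid
    r = (np.arange(n_grid) + 0.5) * h  # midpoints avoid dividing by zero at r = 0
    basis = np.array([radial_l0(n, r, cutoff) for n in range(1, n_max + 1)])
    return (basis * r**2) @ basis.T * h

overlaps = overlap_matrix(4)  # expected to be close to the 4x4 identity matrix
```

Increasing n adds one more radial node each time, which is the sense in which the index controls the smoothness of the functions inside the sphere.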

12.
J Chem Phys ; 157(17): 177101, 2022 Nov 07.
Article in English | MEDLINE | ID: mdl-36347686

ABSTRACT

The "quasi-constant" smooth overlap of atomic position and atom-centered symmetry function fingerprint manifolds recently discovered by Parsaeifard and Goedecker [J. Chem. Phys. 156, 034302 (2022)] are closely related to the degenerate pairs of configurations, which are known shortcomings of all low-body-order atom-density correlation representations of molecular structures. Configurations that are rigorously singular (which we demonstrate can only occur in finite, discrete sets and not as a continuous manifold) determine the complete failure of machine-learning models built on this class of descriptors. The "quasi-constant" manifolds, on the other hand, exhibit low but non-zero sensitivity to atomic displacements. As a consequence, for any such manifold, it is possible to optimize model parameters and the training set to mitigate their impact on learning even though this is often impractical and it is preferable to use descriptors that avoid both exact singularities and the associated numerical instability.

13.
Proc Natl Acad Sci U S A ; 116(4): 1110-1115, 2019 01 22.
Article in English | MEDLINE | ID: mdl-30610171

ABSTRACT

Thermodynamic properties of liquid water as well as hexagonal (Ih) and cubic (Ic) ice are predicted based on density functional theory at the hybrid-functional level, rigorously taking into account quantum nuclear motion, anharmonic fluctuations, and proton disorder. This is made possible by combining advanced free-energy methods and state-of-the-art machine-learning techniques. The ab initio description leads to structural properties in excellent agreement with experiments and reliable estimates of the melting points of light and heavy water. We observe that nuclear-quantum effects contribute a crucial [Formula: see text] to the stability of ice Ih, making it more stable than ice Ic. Our computational approach is general and transferable, providing a comprehensive framework for quantitative predictions of ab initio thermodynamic properties using machine-learning potentials as an intermediate step.

14.
Proc Natl Acad Sci U S A ; 116(9): 3401-3406, 2019 02 26.
Article in English | MEDLINE | ID: mdl-30733292

ABSTRACT

The molecular dipole polarizability describes the tendency of a molecule to change its dipole moment in response to an applied electric field. This quantity governs key intra- and intermolecular interactions, such as induction and dispersion; plays a vital role in determining the spectroscopic signatures of molecules; and is an essential ingredient in polarizable force fields. Compared with other ground-state properties, an accurate prediction of the molecular polarizability is considerably more difficult, as this response quantity is quite sensitive to the underlying electronic structure description. In this work, we present highly accurate quantum mechanical calculations of the static dipole polarizability tensors of 7,211 small organic molecules computed using linear response coupled cluster singles and doubles theory (LR-CCSD). Using a symmetry-adapted machine-learning approach, we demonstrate that it is possible to predict the LR-CCSD molecular polarizabilities of these small molecules with an error that is an order of magnitude smaller than that of hybrid density functional theory (DFT) at a negligible computational cost. The resultant model is robust and transferable, yielding molecular polarizabilities for a diverse set of 52 larger molecules (including challenging conjugated systems, carbohydrates, small drugs, amino acids, nucleobases, and hydrocarbon isomers) at an accuracy that exceeds that of hybrid DFT. The atom-centered decomposition implicit in our machine-learning approach offers some insight into the shortcomings of DFT in the prediction of this fundamental quantity of interest.
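The definition in the first sentence of this abstract, and the reason a symmetry-adapted treatment is natural for a tensor target, can be sketched in a few lines. The tensor values and the field below are hypothetical numbers in arbitrary units, and `split_polarizability` is an illustrative helper, not part of the model described in the paper. The isotropic part is a rotational scalar, while the traceless symmetric remainder transforms like an l = 2 object.

```python
import numpy as np

def split_polarizability(alpha):
    """Split a symmetric 3x3 tensor into its isotropic value and traceless anisotropic part."""
    alpha = np.asarray(alpha, dtype=float)
    iso = np.trace(alpha) / 3.0
    aniso = alpha - iso * np.eye(3)
    return iso, aniso

# Hypothetical polarizability tensor (arbitrary units), for illustration only.
alpha = np.array([[10.0, 0.5, 0.0],
                  [0.5, 8.0, 0.2],
                  [0.0, 0.2, 6.0]])
iso, aniso = split_polarizability(alpha)

# Linear response: induced dipole mu_i = sum_j alpha_ij E_j for a weak field along z.
field = np.array([0.0, 0.0, 0.01])
induced_dipole = alpha @ field
```

Because the two parts transform differently under rotation, a model can treat them as separate targets without losing information, since their sum reconstructs the original tensor.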

15.
Proc Natl Acad Sci U S A ; 116(51): 25516-25523, 2019 12 17.
Article in English | MEDLINE | ID: mdl-31792179

ABSTRACT

The interface between water and folded proteins is very complex. Proteins have "patchy" solvent-accessible areas composed of domains of varying hydrophobicity. The textbook understanding is that these domains contribute additively to interfacial properties (Cassie's equation, CE). An ever-growing number of modeling papers question the validity of CE at molecular length scales, but there is no conclusive experiment to support this and no proposed new theoretical framework. Here, we study the wetting of model compounds with patchy surfaces differing solely in patchiness but not in composition. Were CE to be correct, these materials would have had the same solid-liquid work of adhesion (WSL) and time-averaged structure of interfacial water. We find considerable differences in WSL, and sum-frequency generation measurements of the interfacial water structure show distinctively different spectral features. Molecular-dynamics simulations of water on patchy surfaces capture the observed behaviors and point toward significant nonadditivity in water density and average orientation. They show that a description of the molecular arrangement on the surface is needed to predict its wetting properties. We propose a predictive model that considers, for every molecule, the contributions of its first-nearest neighbors as a descriptor to determine the wetting properties of the surface. The model is validated by measurements of WSL in multiple solvents, where large differences are observed for solvents whose effective diameter is smaller than ∼6 Å. The experiments and theoretical model proposed here provide a starting point to develop a comprehensive understanding of complex biological interfaces as well as for the engineering of synthetic ones.
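For reference, the additivity assumption that this abstract tests can be written out for a surface made of two components. The symbols below are not defined in the abstract and are assumptions of this sketch: $f_1$ and $f_2$ are the area fractions of the two components, $\theta_1$ and $\theta_2$ the contact angles on the pure surfaces, and $\gamma_{LV}$ the liquid-vapor surface tension. Combining Cassie's equation with the Young-Dupré relation shows why surfaces of identical composition would be expected to have the same work of adhesion regardless of how the patches are arranged.

```latex
% Cassie's equation for a two-component surface
\cos\theta = f_1 \cos\theta_1 + f_2 \cos\theta_2, \qquad f_1 + f_2 = 1

% Young-Dupre relation between work of adhesion and contact angle
W_{SL} = \gamma_{LV}\,(1 + \cos\theta)

% Combining the two: the work of adhesion is additive in the area fractions
W_{SL} = f_1 W_{SL}^{(1)} + f_2 W_{SL}^{(2)}
```

Only the area fractions enter the final expression, so any measured dependence on the arrangement of the patches at fixed composition signals a departure from additivity.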

16.
J Chem Phys ; 154(16): 160401, 2021 Apr 28.
Article in English | MEDLINE | ID: mdl-33940847

ABSTRACT

Over recent years, the use of statistical learning techniques applied to chemical problems has gained substantial momentum. This is particularly apparent in the realm of physical chemistry, where the balance between empiricism and physics-based theory has traditionally been rather in favor of the latter. In this guest Editorial for the special topic issue on "Machine Learning Meets Chemical Physics," a brief rationale is provided, followed by an overview of the topics covered. We conclude by making some general remarks.

17.
J Chem Phys ; 155(10): 104106, 2021 Sep 14.
Article in English | MEDLINE | ID: mdl-34525832

ABSTRACT

The input of almost every machine learning algorithm targeting the properties of matter at the atomic scale involves a transformation of the list of Cartesian atomic coordinates into a more symmetric representation. Many of the most popular representations can be seen as an expansion of the symmetrized correlations of the atom density and differ mainly by the choice of basis. Considerable effort has been dedicated to the optimization of the basis set, typically driven by heuristic considerations on the behavior of the regression target. Here, we take a different, unsupervised viewpoint, aiming to determine the basis that encodes in the most compact way possible the structural information that is relevant for the dataset at hand. For each training dataset and number of basis functions, one can build a unique basis that is optimal in this sense and can be computed at no additional cost with respect to the primitive basis by approximating it with splines. We demonstrate that this construction yields representations that are accurate and computationally efficient, particularly when working with representations that correspond to high-body order correlations. We present examples that involve both molecular and condensed-phase machine-learning models.

18.
J Chem Phys ; 154(7): 074102, 2021 Feb 21.
Article in English | MEDLINE | ID: mdl-33607885

ABSTRACT

Machine-learning models have emerged as a very effective strategy to sidestep time-consuming electronic-structure calculations, enabling accurate simulations of greater size, time scale, and complexity. Given the interpolative nature of these models, the reliability of predictions depends on the position in phase space, and it is crucial to obtain an estimate of the error that derives from the finite number of reference structures included during model training. When using a machine-learning potential to sample a finite-temperature ensemble, the uncertainty on individual configurations translates into an error on thermodynamic averages and leads to a loss of accuracy when the simulation enters a previously unexplored region. Here, we discuss how uncertainty quantification can be used, together with a baseline energy model, or a more robust but less accurate interatomic potential, to obtain more resilient simulations and to support active-learning strategies. Furthermore, we introduce an on-the-fly reweighing scheme that makes it possible to estimate the uncertainty in thermodynamic averages extracted from long trajectories. We present examples covering different types of structural and thermodynamic properties and systems as diverse as water and liquid gallium.
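One common way to obtain the kind of error estimate described in this abstract is a committee of models trained on resampled data, with the spread of their predictions serving as the uncertainty. The sketch below illustrates that idea on a one-dimensional toy problem. It is an assumption-laden stand-in, not the scheme of the paper: the polynomial models, bootstrap resampling, toy data, and the name `committee_predict` are all chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy reference data (hypothetical), sampled only on the interval [-1, 1].
x_train = rng.uniform(-1.0, 1.0, size=40)
y_train = np.sin(2.0 * x_train) + 0.05 * rng.normal(size=40)

def committee_predict(x_query, n_models=16, degree=3):
    """Mean and spread of predictions from models fitted on bootstrap resamples."""
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(x_train), size=len(x_train))  # resample with replacement
        coeffs = np.polyfit(x_train[idx], y_train[idx], degree)
        preds.append(np.polyval(coeffs, x_query))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

# Query one point inside the sampled region and one far outside it.
mean, sigma = committee_predict(np.array([0.0, 3.0]))
```

The spread at the point outside the training interval is much larger than at the point inside it, which is the signal that could trigger a fallback to a more robust baseline or the selection of new reference structures.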

19.
J Chem Phys ; 154(11): 114109, 2021 Mar 21.
Article in English | MEDLINE | ID: mdl-33752353

ABSTRACT

Physically motivated and mathematically robust atom-centered representations of molecular structures are key to the success of modern atomistic machine learning. They lie at the foundation of a wide range of methods to predict the properties of both materials and molecules and to explore and visualize their chemical structures and compositions. Recently, it has become clear that many of the most effective representations share a fundamental formal connection. They can all be expressed as a discretization of n-body correlation functions of the local atom density, suggesting the opportunity of standardizing and, more importantly, optimizing their evaluation. We present an implementation, named librascal, whose modular design lends itself both to developing refinements to the density-based formalism and to rapid prototyping for new developments of rotationally equivariant atomistic representations. As an example, we discuss smooth overlap of atomic position (SOAP) features, perhaps the most widely used member of this family of representations, to show how the expansion of the local density can be optimized for any choice of radial basis sets. We discuss the representation in the context of a kernel ridge regression model, commonly used with SOAP features, and analyze how the computational effort scales for each of the individual steps of the calculation. By applying data reduction techniques in feature space, we show how to reduce the total computational cost by a factor of up to 4 without affecting the model's symmetry properties and without significantly impacting its accuracy.

20.
Phys Rev Lett ; 125(16): 166001, 2020 Oct 16.
Article in English | MEDLINE | ID: mdl-33124874

ABSTRACT

Many-body descriptors are widely used to represent atomic environments in the construction of machine-learned interatomic potentials and more broadly for fitting, classification, and embedding tasks on atomic structures. There is a widespread belief in the community that three-body correlations are likely to provide an overcomplete description of the environment of an atom. We produce several counterexamples to this belief, with the consequence that any classifier, regression, or embedding model for atom-centered properties that uses three- (or four)-body features will incorrectly give identical results for different configurations. Writing global properties (such as total energies) as a sum of many atom-centered contributions mitigates the impact of this fundamental deficiency, explaining the success of current "machine-learning" force fields. We anticipate the issues that will arise as the desired accuracy increases, and suggest potential solutions.
