ABSTRACT
There is a growing desire for inter-package modularity within the chemistry software community to reuse encapsulated code units across a variety of software packages. Most comprehensive efforts at achieving inter-package modularity will quickly run afoul of a very practical problem: being able to cohesively build the modules. Writing and maintaining build systems has long been an issue for many scientific software packages that rely on compiled languages such as C/C++. The push for inter-package modularity compounds this issue by additionally requiring binary artifacts from disparate developers to interoperate at a binary level. Thankfully, the de facto build tool for C/C++, CMake, is more than capable of supporting the myriad edge cases that complicate writing robust build systems. Unfortunately, writing and maintaining a robust CMake build system can be a laborious endeavor because CMake provides few abstractions to aid the developer. The need to significantly simplify the process of writing robust CMake-based build systems, especially in inter-package builds, motivated us to write CMaize. In addition to describing the architecture and design of CMaize, the article also demonstrates how CMaize is used in production-level software.
ABSTRACT
The power of quantum chemistry to predict the ground and excited state properties of complex chemical systems has driven the development of computational quantum chemistry software, integrating advances in theory, applied mathematics, and computer science. The emergence of new computational paradigms associated with exascale technologies also poses significant challenges that require a flexible forward strategy to take full advantage of existing and forthcoming computational resources. In this context, the sustainability and interoperability of computational chemistry software development are among the most pressing issues. In this perspective, we discuss software infrastructure needs and investments with an eye toward fully utilizing exascale resources and providing unique computational tools for next-generation science problems and scientific discoveries.
ABSTRACT
For many computational chemistry packages, being able to efficiently and effectively scale across an exascale cluster is a heroic feat. Collective experience from the Department of Energy's Exascale Computing Project suggests that achieving exascale performance requires far more planning, design, and optimization than scaling to petascale. In many cases, entire rewrites of software are necessary to address fundamental algorithmic bottlenecks. This in turn requires a tremendous amount of resources and development time, resources that cannot reasonably be afforded by every computational science project. It thus becomes imperative that computational science transition to a more sustainable paradigm. Key to such a paradigm is modular software. While the importance of modular software is widely recognized, what is perhaps not so widely appreciated is the effort still required to leverage modular software in a sustainable manner. The present manuscript introduces PluginPlay (https://github.com/NWChemEx-Project/PluginPlay), an inversion-of-control framework designed to facilitate developing, maintaining, and sustaining modular scientific software packages. This manuscript focuses on the design aspects of PluginPlay and how they specifically influence the performance of the resulting package. Although PluginPlay serves as the framework for the NWChemEx package, PluginPlay is not tied to NWChemEx or even computational chemistry. We thus anticipate that PluginPlay will prove to be a generally useful tool for a number of computational science packages looking to transition to the exascale.
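To illustrate what "inversion of control" means in this setting, here is a minimal sketch in Python. It is not PluginPlay's API: `ModuleManager`, `register`, `run`, and the two toy pair-energy modules are hypothetical names invented for this example. The only point is that a driver asks the framework for a capability by key, so an implementation can be swapped without editing the driver.

```python
class ModuleManager:
    """Toy registry (hypothetical): callers request a capability by key and
    the framework, not the caller, decides which implementation runs."""

    def __init__(self):
        self._modules = {}

    def register(self, key, module):
        # Registering under an existing key replaces the implementation.
        self._modules[key] = module

    def run(self, key, *args):
        # Every module receives the manager so it can call submodules by key.
        return self._modules[key](self, *args)


def coulomb_pair(mm, r):
    """Toy pair energy: bare 1/r."""
    return 1.0 / r


def screened_pair(mm, r):
    """Toy alternative implementation: uniformly scaled 1/r."""
    return 0.5 / r


def total_energy(mm, distances):
    """Driver module: it never names a concrete pair-energy routine."""
    return sum(mm.run("pair_energy", r) for r in distances)


mm = ModuleManager()
mm.register("total_energy", total_energy)

mm.register("pair_energy", coulomb_pair)
e_coulomb = mm.run("total_energy", [1.0, 2.0])

# Swap the submodule; the driver code above is untouched.
mm.register("pair_energy", screened_pair)
e_screened = mm.run("total_energy", [1.0, 2.0])
```

In this toy design the driver depends only on the key `"pair_energy"`, which is the property that lets independently developed modules be exchanged.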
ABSTRACT
Since the advent of the first computers, chemists have been at the forefront of using computers to understand and solve complex chemical problems. As the hardware and software have evolved, so have the theoretical and computational chemistry methods and algorithms. Parallel computers clearly changed the common computing paradigm in the late 1970s and 1980s, and the field has again seen a paradigm shift with the advent of graphics processing units. This review explores the challenges and some of the solutions in transforming software from the terascale to the petascale and now to the upcoming exascale computers. While discussing the field in general, NWChem and its redesign, NWChemEx, will be highlighted as one of the early codesign projects to take advantage of massively parallel computers and emerging software standards to enable large scientific challenges to be tackled.
ABSTRACT
Fragment-based methods promise accurate energetics at a cost that scales linearly with the number of fragments. This promise is founded on the premise that the many-body expansion (or another similar energy decomposition) needs to only consider spatially local many-body interactions. Experience and chemical intuition suggest that typically at most four-body interactions are required for high accuracy. Bettens and co-workers [J. Chem. Theory Comput. 2014, 10, 3699-3707] published a detailed study showing that for moderately sized water clusters, basis set superposition error (BSSE) undermines this premise. Ultimately, they were able to overcome BSSE by performing all computations in the supersystem basis set, but such a solution destroys the reduced computational scaling of fragment-based methods. Their findings led them to suggest that there is "trouble with the many-body expansion". Since then, a follow-up study from Bettens and co-workers [J. Chem. Theory Comput. 2015, 11, 5132-5143] as well as a related study by Mayer and Bakó [J. Chem. Theory Comput. 2017, 13, 1883-1886] have proposed new frameworks for understanding BSSE in the many-body expansion. Although the two frameworks ultimately propose the same working set of equations for the BSSE problem, their interpretations are quite different, even disagreeing on whether or not the solution is an approximation. In this work we propose a more general BSSE framework. We then show that, somewhat paradoxically, the two interpretations are compatible and amount to two different "normalization" conditions. Finally, we consider applications of these BSSE frameworks to small water clusters, where we focus on replicating high-accuracy coupled cluster benchmarks.
Ultimately, we show for water clusters, using the present framework, that one can obtain results that are within ±0.5 kcal mol⁻¹ of the coupled cluster complete basis set limit without considering any more than a correlated three-body computation in a quadruple-ζ basis set and a four-body triple-ζ Hartree-Fock computation.
ABSTRACT
Psi4 is an ab initio electronic structure program providing methods such as Hartree-Fock, density functional theory, configuration interaction, and coupled-cluster theory. The 1.1 release represents a major update meant to automate complex tasks, such as geometry optimization using complete-basis-set extrapolation or focal-point methods. Conversion of the top-level code to a Python module means that Psi4 can now be used in complex workflows alongside other Python tools. Several new features have been added with the aid of libraries providing easy access to techniques such as density fitting, Cholesky decomposition, and Laplace denominators. The build system has been completely rewritten to simplify interoperability with independent, reusable software components for quantum chemistry. Finally, a wide range of new theoretical methods and analyses have been added to the code base, including functional-group and open-shell symmetry-adapted perturbation theory, density-fitted coupled cluster with frozen natural orbitals, orbital-optimized perturbation and coupled-cluster methods (e.g., OO-MP2 and OO-LCCD), density-fitted multiconfigurational self-consistent field, density cumulant functional theory, algebraic-diagrammatic construction excited states, improvements to the geometry optimizer, and the "X2C" approach to relativistic corrections, among many other improvements.
ABSTRACT
To complement our study of the role of finite precision in electronic structure calculations based on a truncated many-body expansion (MBE, or "n-body expansion"), we examine the accuracy of such methods in the present work. Accuracy may be defined either with respect to a supersystem calculation computed at the same level of theory as the n-body calculations, or alternatively with respect to high-quality benchmarks. Both metrics are considered here. In applications to a sequence of water clusters, (H2O)N=6-55 described at the B3LYP/cc-pVDZ level, we obtain mean absolute errors (MAEs) per H2O monomer of ∼1.0 kcal/mol for two-body expansions, where the benchmark is a B3LYP/cc-pVDZ calculation on the entire cluster. Three- and four-body expansions exhibit MAEs of 0.5 and 0.1 kcal/mol/monomer, respectively, without resort to charge embedding. A generalized many-body expansion truncated at two-body terms [GMBE(2)], using 3-4 H2O molecules per fragment, outperforms all of these methods and affords a MAE of ∼0.02 kcal/mol/monomer, also without charge embedding. GMBE(2) requires significantly fewer (although somewhat larger) subsystem calculations as compared to MBE(4), reducing problems associated with floating-point roundoff errors. When compared to high-quality benchmarks, we find that error cancellation often plays a critical role in the success of MBE(n) calculations, even at the four-body level, as basis-set superposition error can compensate for higher-order polarization interactions. A many-body counterpoise correction is introduced for the GMBE, and its two-body truncation [GMBCP(2)] is found to afford good results without error cancellation. Together with a method such as ωB97X-V/aug-cc-pVTZ that can describe both covalent and non-covalent interactions, the GMBE(2)+GMBCP(2) approach provides an accurate, stable, and tractable approach for large systems.
ABSTRACT
In designing organic materials for electronics applications, particularly for organic photovoltaics (OPV), the ionization potential (IP) of the donor and the electron affinity (EA) of the acceptor play key roles. This makes OPV design an appealing application for computational chemistry since IPs and EAs are readily calculable from most electronic structure methods. Unfortunately, reliable, high-accuracy wave function methods, such as coupled cluster theory with single, double, and perturbative triples [CCSD(T)] in the complete basis set (CBS) limit, are too expensive for routine applications to this problem for any but the smallest of systems. One solution is to calibrate approximate, less computationally expensive methods against a database of high-accuracy IP/EA values; however, to our knowledge, no such database exists for systems related to OPV design. The present work is the first of a multipart study whose overarching goal is to determine which computational methods can be used to reliably compute IPs and EAs of electron acceptors. This part introduces a database of 24 known organic electron acceptors and provides high-accuracy vertical IP and EA values expected to be within ±0.03 eV of the true non-relativistic, vertical CCSD(T)/CBS limit. Convergence of IP and EA values toward the CBS limit is studied systematically for the Hartree-Fock, MP2 correlation, and beyond-MP2 coupled cluster contributions to the focal point estimates.
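For orientation, the quantities named in this abstract can be written out as follows. This is a sketch of the conventional definitions, not a transcription of the paper's working equations. The symbols $R_0$ (geometry of the neutral molecule), $E_N$ (energy of the $N$-electron system), and the label "small" (a basis set affordable for coupled cluster) are notation assumed here.

```latex
% Vertical IP and EA: all energies evaluated at the neutral geometry R_0
\mathrm{IP}_{\mathrm{vert}} = E_{N-1}(R_0) - E_{N}(R_0), \qquad
\mathrm{EA}_{\mathrm{vert}} = E_{N}(R_0) - E_{N+1}(R_0)

% Focal-point style estimate built from the three contributions listed above
E_{\mathrm{CCSD(T)/CBS}} \approx
    E_{\mathrm{HF}}^{\mathrm{CBS}}
  + E_{\mathrm{corr,\,MP2}}^{\mathrm{CBS}}
  + \left[ E_{\mathrm{corr,\,CCSD(T)}}^{\mathrm{small}}
         - E_{\mathrm{corr,\,MP2}}^{\mathrm{small}} \right]
```

The bracketed term is the beyond-MP2 coupled cluster contribution. Writing the estimate this way is what allows the convergence of each contribution toward the CBS limit to be examined separately.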
ABSTRACT
Comparison of ab initio electron-propagator predictions of vertical ionization potentials and electron affinities of organic acceptor molecules with benchmark calculations based on the basis set-extrapolated, coupled cluster single, double, and perturbative triple substitution method has enabled identification of self-energy approximations with mean unsigned errors between 0.1 and 0.2 eV. Among the self-energy approximations that neglect off-diagonal elements in the canonical Hartree-Fock orbital basis, the P3 method for electron affinities and the P3+ method for ionization potentials provide the best combination of accuracy and computational efficiency. For approximations that consider the full self-energy matrix, the NR2 methods offer the best performance. The P3+ and NR2 methods successfully identify the correct symmetry label of the lowest cationic state in two cases, naphthalenedione and benzoquinone, where some other methods fail.
ABSTRACT
Electronic structure methods based on low-order "n-body" expansions are an increasingly popular means to defeat the highly nonlinear scaling of ab initio quantum chemistry calculations, taking advantage of the inherently distributable nature of the numerous subsystem calculations. Here, we examine how the finite precision of these subsystem calculations manifests in applications to large systems, in this case, a sequence of water clusters ranging in size up to (H2O)47. Using two different computer implementations of the n-body expansion, one fully integrated into a quantum chemistry program and the other written as a separate driver routine for the same program, we examine the reproducibility of total binding energies as a function of cluster size. The combinatorial nature of the n-body expansion amplifies subtle differences between the two implementations, especially for n ⩾ 4, leading to total energies that differ by as much as several kcal/mol between two implementations of what is ostensibly the same method. This behavior can be understood based on a propagation-of-errors analysis applied to a closed-form expression for the n-body expansion, which is derived here for the first time. Discrepancies between the two implementations arise primarily from the Coulomb self-energy correction that is required when electrostatic embedding charges are implemented by means of an external driver program. For reliable results in large systems, our analysis suggests that script- or driver-based implementations should read binary output files from an electronic structure program, in full double precision, or better yet be fully integrated in a way that avoids the need to compute the aforementioned self-energy. Moreover, four-body and higher-order expansions may be too sensitive to numerical thresholds to be of practical use in large systems.
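The role of the closed-form expression can be illustrated with a small sketch. It assumes the closed form is E(n) = Σ_{m=1..n} (-1)^(n-m) C(N-m-1, n-m) S_m, where S_m is the sum of all m-mer energies. That is our reading of the result, checked numerically below against a direct term-by-term summation rather than taken from the paper's text. The energy function is a hypothetical toy model standing in for an electronic structure call, and all function names are invented for this example.

```python
from itertools import combinations
from math import comb, dist


def make_toy_energy(coords):
    """Return a cheap, hypothetical stand-in for an electronic structure call."""
    def energy(frags):
        pair = sum(1.0 / dist(coords[i], coords[j])
                   for i, j in combinations(frags, 2))
        # The squared term makes the model non-pairwise-additive.
        return -1.0 * len(frags) - pair + 0.05 * pair ** 2
    return energy


def mbe_direct(energy, n_frag, order):
    """Sum all k-body corrections (k <= order), each by inclusion-exclusion."""
    total = 0.0
    for k in range(1, order + 1):
        for cluster in combinations(range(n_frag), k):
            for m in range(1, k + 1):
                sign = (-1) ** (k - m)
                total += sign * sum(energy(sub)
                                    for sub in combinations(cluster, m))
    return total


def closed_form_coefficients(n_frag, order):
    """Prefactors of the summed m-mer energies, m = 1..order (assumed form)."""
    coeffs = []
    for m in range(1, order + 1):
        k = order - m
        binom = 1 if k == 0 else comb(n_frag - m - 1, k)
        coeffs.append((-1) ** k * binom)
    return coeffs


def mbe_closed_form(energy, n_frag, order):
    """Evaluate the truncated expansion from the summed m-mer energies."""
    total = 0.0
    for m, c in enumerate(closed_form_coefficients(n_frag, order), start=1):
        total += c * sum(energy(s) for s in combinations(range(n_frag), m))
    return total


coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.7, 0.0),
          (0.0, 0.0, 1.9), (1.6, 1.6, 1.6)]
energy = make_toy_energy(coords)
direct = [mbe_direct(energy, 5, n) for n in range(1, 6)]
closed = [mbe_closed_form(energy, 5, n) for n in range(1, 6)]
supersystem = energy(tuple(range(5)))
```

In this form, every subsystem energy is multiplied by a binomial prefactor that grows with cluster size. For N = 47 and n = 4, `closed_form_coefficients(47, 4)` gives a monomer prefactor of magnitude 14190, so small per-calculation errors in the summed energies can be strongly amplified. This is consistent with the sensitivity of four-body expansions described above.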
ABSTRACT
The past 15 years have witnessed an explosion of activity in the field of fragment-based quantum chemistry, whereby ab initio electronic structure calculations are performed on very large systems by decomposing them into a large number of relatively small subsystem calculations and then reassembling the subsystem data in order to approximate supersystem properties. Most of these methods are based, at some level, on the so-called many-body (or "n-body") expansion, which ultimately requires calculations on monomers, dimers, ..., n-mers of fragments. To the extent that a low-order n-body expansion can reproduce supersystem properties, such methods replace an intractable supersystem calculation with a large number of easily distributable subsystem calculations. This holds great promise for performing, for example, "gold standard" CCSD(T) calculations on large molecules, clusters, and condensed-phase systems. The literature is awash in a litany of fragment-based methods, each with its own working equations and terminology, which presents a formidable language barrier to the uninitiated reader. We have sought to unify these methods under a common formalism, by means of a generalized many-body expansion that provides a universal energy formula encompassing not only traditional n-body cluster expansions but also methods designed for macromolecules, in which the supersystem is decomposed into overlapping fragments. This formalism allows various fragment-based methods to be systematically classified, primarily according to how the fragments are constructed and how higher-order n-body interactions are approximated. This classification furthermore suggests systematic ways to improve the accuracy.
Whereas n-body approaches have been thoroughly tested at low levels of theory in small noncovalent clusters, we have begun to explore the efficacy of these methods for large systems, with the goal of reproducing benchmark-quality calculations, ideally meaning complete-basis CCSD(T). For high accuracy, it is necessary to deal with basis-set superposition error, and this necessitates the use of many-body counterpoise corrections and electrostatic embedding methods that are stable in large basis sets. Tests on small noncovalent clusters suggest that total energies of complete-basis CCSD(T) quality can indeed be obtained, with dramatic reductions in aggregate computing time. On the other hand, naive applications of low-order n-body expansions may benefit from significant error cancellation, wherein basis-set superposition error partially offsets the effects of higher-order n-body terms, affording fortuitously good results in some cases. Basis sets that afford reasonable results in small clusters behave erratically in larger systems and when high-order n-body expansions are employed. For large systems, and (H2O)N≳30 is large enough, the combinatorial nature of the many-body expansion presents the possibility of serious loss-of-precision problems that are not widely appreciated. Tight thresholds are required in the subsystem calculations in order to stave off size-dependent errors, and high-order expansions may be inherently numerically ill-posed. Moreover, commonplace script- or driver-based implementations of the n-body expansion may be especially susceptible to loss-of-precision problems in large systems. These results suggest that the many-body expansion is not yet ready to be treated as a "black-box" quantum chemistry method.
ABSTRACT
High-accuracy electronic structure calculations with correlated wave functions demand the use of large basis sets and complete-basis extrapolation, but the accuracy of fragment-based quantum chemistry methods has most often been evaluated using double-ζ basis sets, with errors evaluated relative to a supersystem calculation using the same basis set. Here, we examine the convergence towards the basis-set limit of two- and three-body expansions of the energy, for water clusters and ion-water clusters, focusing on calculations at the level of second-order Møller-Plesset perturbation theory (MP2). Several different corrections for basis-set superposition error (BSSE), each consistent with a truncated many-body expansion, are examined as well. We present a careful analysis of how the interplay of errors (from all sources) influences the accuracy of the results. We conclude that fragment-based methods often benefit from error cancellation wherein BSSE offsets both incompleteness of the basis set and higher-order many-body effects that are neglected in a truncated many-body expansion. An n-body counterpoise correction facilitates smooth extrapolation to the MP2 basis-set limit, and at n = 3 affords accurate results while requiring calculations in subsystems no larger than trimers.
ABSTRACT
The traditional many-body expansion, in which a system's energy is expressed in terms of the energies of its constituent monomers, dimers, trimers, etc., has recently been generalized to the case where the "monomers" (subsystems, or "fragments") overlap. Two such generalizations have been proposed, and here, we compare the two, both formally and numerically. We conclude that the two approaches are distinct, although in many cases they yield comparable and accurate results when truncated at the level of dimers. However, tests on fluoride-water clusters suggest that the approach that we have previously called the "generalized many-body expansion" (GMBE) [J. Chem. Phys. 137, 064113 (2012)] is more robust, with respect to the choice of fragments, as compared to an alternative "many overlapping body expansion" [J. Chem. Theory Comput. 8, 2669 (2012)]. A more detailed justification for the GMBE is also provided here.
ABSTRACT
An efficient procedure is introduced to obtain the basis-set limit in electronic structure calculations of large molecular and ionic clusters. This approach is based on a Boys-Bernardi-style counterpoise correction for clusters containing arbitrarily many monomer units, which is rendered computationally feasible by means of a truncated many-body expansion. This affords a tractable way to apply the sequence of correlation-consistent basis sets (aug-cc-pVXZ) to large systems and thereby obtain energies extrapolated to the complete basis set (CBS) limit. A three-body expansion with three-body counterpoise corrections is shown to afford errors of ≲0.1-0.2 kcal/mol with respect to traditional MP2/CBS results, even for challenging systems such as fluoride-water clusters. A triples correction, $\delta_{\mathrm{CCSD(T)}} = E_{\mathrm{CCSD(T)}} - E_{\mathrm{MP2}}$, can be estimated accurately and efficiently as well. Because the procedure is embarrassingly parallelizable and requires no electronic structure calculations in systems larger than trimers, it is extendible to very large clusters. As compared to traditional CBS extrapolations, computational time is dramatically reduced even without parallelization.
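As a rough illustration of what extrapolating a basis-set sequence involves, the sketch below uses the widely used two-point inverse-cubic form for correlation energies. The abstract does not say which extrapolation formula the authors adopt, so this choice is an assumption. The function names are hypothetical and the numbers are synthetic, not results from the paper.

```python
def extrapolate_x3(e_small, x_small, e_large, x_large):
    """Two-point extrapolation assuming E(X) = E_CBS + A / X**3.

    x_small and x_large are cardinal numbers of the basis sets
    (e.g. 3 for aug-cc-pVTZ, 4 for aug-cc-pVQZ).
    """
    xs3, xl3 = x_small ** 3, x_large ** 3
    return (xl3 * e_large - xs3 * e_small) / (xl3 - xs3)


def triples_correction(e_ccsdt, e_mp2):
    """delta_CCSD(T) = E_CCSD(T) - E_MP2, both in the same (smaller) basis."""
    return e_ccsdt - e_mp2


def composite_energy(e_mp2_cbs, delta_ccsdt):
    """MP2/CBS estimate plus the additive triples correction."""
    return e_mp2_cbs + delta_ccsdt


# Synthetic check: energies that follow the assumed X**-3 law exactly.
E_CBS_TRUE, A = -0.300, 0.100
e_tz = E_CBS_TRUE + A / 3 ** 3
e_qz = E_CBS_TRUE + A / 4 ** 3
e_cbs = extrapolate_x3(e_tz, 3, e_qz, 4)

# Synthetic small-basis energies, used only to exercise the composite step.
delta = triples_correction(e_ccsdt=-0.260, e_mp2=-0.250)
e_total = composite_energy(e_cbs, delta)
```

In a fragment-based setting, the same extrapolation would be applied to energies assembled from counterpoise-corrected monomer, dimer, and trimer calculations rather than to supersystem energies.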
ABSTRACT
An implementation of Ewald summation for use in mixed quantum mechanics/molecular mechanics (QM/MM) calculations is presented, which builds upon previous work by others that was limited to semi-empirical electronic structure for the QM region. Unlike previous work, our implementation describes the wave function's periodic images using "ChElPG" atomic charges, which are determined by fitting to the QM electrostatic potential evaluated on a real-space grid. This implementation is stable even for large Gaussian basis sets with diffuse exponents, and is thus appropriate when the QM region is described by a correlated wave function. Derivatives of the ChElPG charges with respect to the QM density matrix are a potentially serious bottleneck in this approach, so we introduce a ChElPG algorithm based on atom-centered Lebedev grids. The ChElPG charges thus obtained exhibit good rotational invariance even for sparse grids, enabling significant cost savings. Detailed analysis of the optimal choice of user-selected Ewald parameters, as well as timing breakdowns, is presented.
ABSTRACT
Fragment-based quantum chemistry methods are a promising route towards massively parallel electronic structure calculations in large systems. Unfortunately, the literature on this topic consists of a bewildering array of different methods, with no clear guiding principles to choose amongst them. Here, we introduce a conceptual framework that unifies many of these ostensibly disparate approaches. The common framework is based upon an approximate supersystem energy formula for a collection of intersecting (i.e., overlapping) fragments. This formula generalizes the traditional many-body expansion to cases where the "bodies" (fragments) share some nuclei in common, and reduces to the traditional many-body expansion for non-overlapping fragments. We illustrate how numerous fragment-based methods fit within this framework. Preliminary applications to molecular and ionic clusters suggest that two-body methods in which dimers are constructed from intersecting fragments may be a route to achieve very high accuracy in fragment-based calculations.
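For reference, the traditional many-body expansion that this framework generalizes can be written as follows for non-overlapping fragments $I, J, K, \ldots$. This is the standard textbook form. The generalized formula for intersecting fragments is not reproduced here.

```latex
E = \sum_{I} E_{I}
  + \sum_{I<J} \Delta E_{IJ}
  + \sum_{I<J<K} \Delta E_{IJK}
  + \cdots

\Delta E_{IJ}  = E_{IJ} - E_{I} - E_{J}

\Delta E_{IJK} = E_{IJK}
               - \Delta E_{IJ} - \Delta E_{IK} - \Delta E_{JK}
               - E_{I} - E_{J} - E_{K}
```

Truncating after the $\Delta E_{IJ}$ terms gives a two-body method, in which the largest subsystem calculation is a dimer of fragments.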
ABSTRACT
The electronic spectrum of alternant polycyclic aromatic hydrocarbons (PAHs) includes two singlet excited states that are often denoted ¹La and ¹Lb. Time-dependent density functional theory (TD-DFT) affords reasonable excitation energies for the ¹Lb state in such molecules, but often severely underestimates ¹La excitation energies and fails to reproduce observed trends in the ¹La excitation energy as a function of molecular size. Here, we examine the performance of long-range-corrected (LRC) density functionals for the ¹La and ¹Lb states of various PAHs. With an appropriate choice for the Coulomb attenuation parameter, we find that LRC functionals avoid the severe underestimation of the ¹La excitation energies that afflicts other TD-DFT approaches, while errors in the ¹Lb excitation energies are less sensitive to this parameter. This suggests that the ¹La states of certain PAHs exhibit some sort of charge-separated character, consistent with the description of this state within valence-bond theory, but such character proves difficult to identify a priori. We conclude that TD-DFT calculations in medium-size, conjugated organic molecules may involve significant but hard-to-detect errors. Comparison of LRC and non-LRC results is recommended as a qualitative diagnostic.
ABSTRACT
Recent studies have suggested that octanitrocubane and heptanitrocubane may be two of the most powerful non-nuclear high-energy materials currently known. Progressive replacement of the hydrogen atoms on cubane with nitroso groups is also expected to produce new potential high-energy materials, which should have thermodynamic properties similar to those of the nitrocubanes. In this study we predict optimized structures, vibrational frequencies, enthalpies of formation, and specific enthalpies of combustion for a series of nitrosocubanes ranging from mononitrosocubane to octanitrosocubane. Our results indicate, on the basis of the specific enthalpies of combustion alone, that mononitrosocubane should make the best new high-energy material; however, we speculate that the velocity of detonation of octa- and heptanitrosocubane will make them better high-energy materials.