ABSTRACT
There is a growing desire for inter-package modularity within the chemistry software community to reuse encapsulated code units across a variety of software packages. Most comprehensive efforts at achieving inter-package modularity will quickly run afoul of a very practical problem: being able to cohesively build the modules. Writing and maintaining build systems has long been an issue for many scientific software packages that rely on compiled languages such as C/C++. The push for inter-package modularity compounds this issue by additionally requiring binary artifacts from disparate developers to interoperate at a binary level. Thankfully, the de facto build tool for C/C++, CMake, is more than capable of supporting the myriad of edge cases that complicate writing robust build systems. Unfortunately, writing and maintaining a robust CMake build system can be a laborious endeavor because CMake provides few abstractions to aid the developer. The need to significantly simplify the process of writing robust CMake-based build systems, especially in inter-package builds, motivated us to write CMaize. In addition to describing the architecture and design of CMaize, the article also demonstrates how CMaize is used in production-level software.
ABSTRACT
The new LOGKPREDICT program integrates HostDesigner molecular design software with the machine learning (ML) program Chemprop. By supplying HostDesigner with predicted log K values, LOGKPREDICT enhances the computer-aided molecular design process by ranking ligands directly by metal-ligand binding strength. Harnessing reliable experimental data from a historic National Institute of Standards and Technology (NIST) database and data from the International Union of Pure and Applied Chemistry (IUPAC), we train message passing neural net algorithms. The multi-metal NIST-based ML model has a root mean square error (RMSE) of 0.629 ± 0.044 (R2 of 0.960 ± 0.006), while two versions of lanthanide-only IUPAC-based ML models have, respectively, RMSE of 0.764 ± 0.073 (R2 of 0.976 ± 0.005) and 0.757 ± 0.071 (R2 of 0.959 ± 0.007). For relative log K predictions on an out-of-sample set of six ligands, demonstrating metal ion selectivity, the RMSE value reaches a commendably low 0.25. We showcase the use of LOGKPREDICT in identifying ligands with high selectivity for lanthanides in aqueous solutions, a finding supported by recent experimental evidence. We also predict new ligands yet to be verified experimentally. Therefore, our ML models implemented through LOGKPREDICT and interfaced with the ligand design software HostDesigner pave the way for designing new ligands with predetermined selectivity for competing metal ions in an aqueous solution.
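The two error metrics quoted above, RMSE and R2, can be illustrated with a minimal sketch. The function names and the log K values below are made up for illustration and are not taken from LOGKPREDICT or its training data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical log K values, chosen only to keep the arithmetic easy to follow.
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(rmse(measured, predicted))       # every error is 0.5 in magnitude
print(r_squared(measured, predicted))  # SS_res = 1.0, SS_tot = 20.0
```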
ABSTRACT
Since the advent of the first computers, chemists have been at the forefront of using computers to understand and solve complex chemical problems. As the hardware and software have evolved, so have the theoretical and computational chemistry methods and algorithms. Parallel computers clearly changed the common computing paradigm in the late 1970s and 80s, and the field has again seen a paradigm shift with the advent of graphical processing units. This review explores the challenges and some of the solutions in transforming software from the terascale to the petascale and now to the upcoming exascale computers. While discussing the field in general, NWChem and its redesign, NWChemEx, will be highlighted as one of the early codesign projects to take advantage of massively parallel computers and emerging software standards to enable large scientific challenges to be tackled.
ABSTRACT
Computational modeling and simulation have become indispensable scientific tools in virtually all areas of chemical, biomolecular, and materials systems research. Computation can provide unique and detailed atomic level information that is difficult or impossible to obtain through analytical theories and experimental investigations. In addition, recent advances in micro-electronics have resulted in computer architectures with unprecedented computational capabilities, from the largest supercomputers to common desktop computers. Combined with the development of new computational domain science methodologies and novel programming models and techniques, this has resulted in modeling and simulation resources capable of providing results at or better than experimental chemical accuracy and for systems in increasingly realistic chemical environments.
ABSTRACT
With the growing reliance of modern supercomputers on accelerator-based architectures such as graphics processing units (GPUs), the development and optimization of electronic structure methods to exploit these massively parallel resources has become a recent priority. While significant strides have been made in the development of GPU-accelerated, distributed memory algorithms for many modern electronic structure methods, the primary focus of GPU development for Gaussian basis atomic orbital methods has been for shared memory systems, with only a handful of examples pursuing massive parallelism. In the present work, we present a set of distributed memory algorithms for the evaluation of the Coulomb and exact exchange matrices for hybrid Kohn-Sham DFT with Gaussian basis sets via direct density-fitted (DF-J-Engine) and seminumerical (sn-K) methods, respectively. The absolute performance and strong scalability of the developed methods are demonstrated on systems ranging from a few hundred to over one thousand atoms using up to 128 NVIDIA A100 GPUs on the Perlmutter supercomputer.
Subjects
Algorithms, Computer Graphics, Density Functional Theory
ABSTRACT
For many computational chemistry packages, being able to efficiently and effectively scale across an exascale cluster is a heroic feat. Collective experience from the Department of Energy's Exascale Computing Project suggests that achieving exascale performance requires far more planning, design, and optimization than scaling to petascale. In many cases, entire rewrites of software are necessary to address fundamental algorithmic bottlenecks. This in turn requires a tremendous amount of resources and development time, resources that cannot reasonably be afforded by every computational science project. It thus becomes imperative that computational science transition to a more sustainable paradigm. Key to such a paradigm is modular software. While the importance of modular software is widely recognized, what is perhaps not so widely appreciated is the effort still required to leverage modular software in a sustainable manner. The present manuscript introduces PluginPlay, https://github.com/NWChemEx-Project/PluginPlay, an inversion-of-control framework designed to facilitate developing, maintaining, and sustaining modular scientific software packages. This manuscript focuses on the design aspects of PluginPlay and how they specifically influence the performance of the resulting package. Although PluginPlay serves as the framework for the NWChemEx package, PluginPlay is not tied to NWChemEx or even computational chemistry. We thus anticipate that PluginPlay will prove to be a generally useful tool for a number of computational science packages looking to transition to the exascale.
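The inversion-of-control idea mentioned above can be sketched generically: the framework, not the calling code, decides which module runs for a given task. The class and function names below are hypothetical and are not PluginPlay's actual API; this is only a minimal illustration of the pattern.

```python
from typing import Callable, Dict


class ModuleManager:
    """Hypothetical registry that maps a task key to a swappable module."""

    def __init__(self) -> None:
        self._modules: Dict[str, Callable[[float], float]] = {}

    def add_module(self, key: str, module: Callable[[float], float]) -> None:
        # Registering under an existing key replaces the previous module.
        self._modules[key] = module

    def run(self, key: str, value: float) -> float:
        # Callers ask for a task by key; the manager picks the implementation.
        return self._modules[key](value)


def harmonic_energy(x: float) -> float:
    """Toy model: E = x^2 / 2."""
    return 0.5 * x * x


def quartic_energy(x: float) -> float:
    """Toy model: E = x^4 / 4."""
    return 0.25 * x ** 4


manager = ModuleManager()
manager.add_module("energy", harmonic_energy)
first = manager.run("energy", 2.0)

# Swap the implementation without touching the calling code.
manager.add_module("energy", quartic_energy)
second = manager.run("energy", 2.0)
print(first, second)
```

The calling code only ever refers to the key "energy", so replacing one module with another requires no changes outside the registration step.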
ABSTRACT
Chemisorbed species can enhance the fluxional dynamics of nanostructured metal surfaces, which has implications for applications such as catalysis. Scanning tunneling microscopy studies at room temperature reveal that the presence of adsorbed sulfur (S) greatly enhances the decay rate of 2D Au islands in the vicinity of extended step edges on Au(111). This enhancement is already significant at S coverages, θS, of a few hundredths of a monolayer (ML), and is most pronounced for 0.1-0.3 ML, where the decay rate is increased by a factor of around 30. For θS close to saturation at about 0.6 ML, sulfur induces pitting and reconstruction of the entire surface, and Au islands are stabilized. Enhanced coarsening at lower θS is attributed to the formation and diffusion across terraces of Au-S complexes, particularly AuS2 and Au4S4, with some lesser contribution from Au3S4. This picture is supported by density functional theory analysis of complex formation energies and diffusion barriers.
ABSTRACT
Projector-based embedding is a relatively recent addition to the collection of methods that seek to utilize chemical locality to provide improved computational efficiency. This work considers the interactions between the different proposed procedures for this method and their effects on the accuracy of the results. The interplay between the embedded background, projector type, partitioning scheme, and level of atomic orbital (AO) truncation is investigated on a selection of reactions from the literature. The Huzinaga projection approach proves to be more reliable than the level-shift projection when paired with other procedural options. Active subsystem partitioning from the subsystem projected AO decomposition (SPADE) procedure proves slightly better than the combination of Pipek-Mezey localization and Mulliken population screening (PMM). Along with these two options, a new partitioning criterion is proposed based on subsystem von Neumann entropy and the related subsystem orbital occupancy. This new method overlaps with the previous PMM method, but the screening process is computationally simpler. Finally, AO truncation proves to be a robust option for the tested systems when paired with the Huzinaga projection, with satisfactory results being acquired at even the most severe truncation level.
ABSTRACT
Community efforts in the computational molecular sciences (CMS) are evolving toward modular, open, and interoperable interfaces that work with existing community codes to provide more functionality and composability than could be achieved with a single program. The Quantum Chemistry Common Driver and Databases (QCDB) project provides such capability through an application programming interface (API) that facilitates interoperability across multiple quantum chemistry software packages. In tandem with the Molecular Sciences Software Institute and their Quantum Chemistry Archive ecosystem, the unique functionalities of several CMS programs are integrated, including CFOUR, GAMESS, NWChem, OpenMM, Psi4, Qcore, TeraChem, and Turbomole, to provide common computational functions, i.e., energy, gradient, and Hessian computations as well as molecular properties such as atomic charges and vibrational frequency analysis. Both standard users and power users benefit from adopting these APIs as they lower the language barrier of input styles and enable a standard layout of variables and data. These designs allow end-to-end interoperable programming of complex computations and provide best practices options by default.
ABSTRACT
As noted in Wikipedia, skin in the game refers to having 'incurred risk by being involved in achieving a goal', where 'skin is a synecdoche for the person involved, and game is the metaphor for actions on the field of play under discussion'. For exascale applications under development in the US Department of Energy Exascale Computing Project, nothing could be more apt, with the skin being exascale applications and the game being delivering comprehensive science-based computational applications that effectively exploit exascale high-performance computing technologies to provide breakthrough modelling and simulation and data science solutions. These solutions will yield high-confidence insights and answers to the most critical problems and challenges for the USA in scientific discovery, national security, energy assurance, economic competitiveness and advanced healthcare. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
ABSTRACT
The purpose of this work is to evaluate the efficacy of oversubscription, at the 1n, 2n, and 3n levels for n physical cores, on semi-direct MP2 methods within NWChem when using two and three Intel nodes. Semi-direct MP2 energy and gradient calculations were performed on chemical systems ranging from 824 to 1626 basis functions using the cc-pVDZ basis set. Wall times for semi-direct MP2 energies were reduced by as much as 36% using two nodes and 44% using three nodes compared to no oversubscription. Total energy consumed by the CPU and DRAM was also reduced by as much as 12% using two nodes and as much as 20% using three nodes when oversubscribing. MP2 gradient wall times improved by as much as 16% using two nodes and 18% using three nodes compared to execution at the 1n level; however, energy savings were insignificant. Intel performance-counter data show a strong correlation between total wall time saved and less time spent in the idle state, indicating a more efficient use of the processors when oversubscribing. © 2019 Wiley Periodicals, Inc.
ABSTRACT
The Basis Set Exchange (BSE) has been a prominent fixture in the quantum chemistry community. First publicly available in 2007, it is recognized by both users and basis set creators as the de facto source for information related to basis sets. This popular resource has been rewritten, utilizing modern software design and best practices. The basis set data has been separated into a stand-alone library with an accessible API, and the Web site has been updated to use the current generation of web development libraries. The general layout and workflow of the Web site is preserved, while helpful features requested by the user community have been added. Overall, this design should increase adaptability and lend itself well into the future as a dependable resource for the computational chemistry community. This article will discuss the decision to rewrite the BSE, the new architecture and design, and the new features that have been added.
Subjects
Computational Chemistry/methods, Quantum Theory, Software, Internet, Programming Languages, Software Design, User-Computer Interface, Workflow
ABSTRACT
Experimental data from low-temperature Scanning Tunneling Microscopy (LTSTM) studies on coinage metal surfaces with very low coverages of S are providing new insights into metal-S interactions. A previous LTSTM study for Cu(100), and a new analysis reported here for Ag(100), both indicate no metal-sulfur complex formation, but an Au4S5 complex was observed previously on Au(100). In marked contrast, various complexes have been proposed and/or observed on Ag(111) and Cu(111), but not on Au(111). Also, exposure to trace amounts of S appears to enhance mass transport far more dramatically on (111) than on (100) surfaces for Cu and Ag, a feature tied to the propensity for complex formation. Motivated by these observations, we present a comprehensive DFT-level assessment of the existence and stability of complexes on (100) surfaces, and compare results with previous analyses for (111) surfaces. Consistent with experiment, our DFT analysis finds no stable complexes on Ag(100) and Cu(100), but several exist for Au(100). In addition, we systematically relate stability for adsorbed and gas-phase species within the framework of Hess's law. We thereby provide key insight into the various energetic contributions to stability, which in turn elucidates the difference in behavior between (100) and (111) surfaces.
ABSTRACT
Correction for 'Escape of anions from geminate recombination in THF due to charge delocalization' by Hung-Cheng Chen et al., Phys. Chem. Chem. Phys., 2017, 19, 32272-32285.
ABSTRACT
The field of computational molecular sciences (CMSs) has made innumerable contributions to the understanding of the molecular phenomena that underlie and control chemical processes, which is manifested in a large number of community software projects and codes. The CMS community is now poised to take the next transformative steps of better training in modern software design and engineering methods and tools, increasing interoperability through more systematic adoption of agreed upon standards and accepted best-practices, overcoming unnecessary redundancy in software effort along with greater reproducibility, and increasing the deployment of new software onto hardware platforms from in-house clusters to mid-range computing systems through to modern supercomputers. This in turn will have future impact on the software that will be created to address grand challenge science that we illustrate here: the formulation of diverse catalysts, descriptions of long-range charge and excitation transfer, and development of structural ensembles for intrinsically disordered proteins.
ABSTRACT
In this work, the effect of oversubscription is evaluated, via calling 2n, 3n, or 4n processes for n physical cores, on semi-direct MP2 energy and gradient calculations and RI-MP2 energy calculations with the cc-pVTZ basis using NWChem. Results indicate that on both Intel and AMD platforms, oversubscription reduces total time to solution on average for semi-direct MP2 energy calculations by 25-45% and reduces total energy consumed by the CPU and DRAM on average by 10-15% on the Intel platform. Semi-direct gradient time to solution is shortened on average by 8-15% and energy consumption is decreased by 5-10%. Linear regression analysis shows a strong correlation between time to solution and total energy consumed. Oversubscribing during RI-MP2 calculations results in performance degradations of 30-50% at the 4n level. © 2017 Wiley Periodicals, Inc.
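The oversubscription levels and the reported percentage reductions follow simple arithmetic, sketched below. The core count and wall times are hypothetical and are not measurements from the study; the helper names are made up for illustration.

```python
def oversubscribed_process_count(physical_cores: int, factor: int) -> int:
    """Number of processes launched at the 'factor * n' oversubscription level."""
    return physical_cores * factor


def percent_reduction(baseline: float, oversubscribed: float) -> float:
    """Relative saving versus the non-oversubscribed (1n) run, in percent."""
    return 100.0 * (baseline - oversubscribed) / baseline


# Hypothetical node with 16 physical cores, run at the 1n through 4n levels.
for level in (1, 2, 3, 4):
    print(f"{level}n -> {oversubscribed_process_count(16, level)} processes")

# Hypothetical wall times in seconds for a 1n run and an oversubscribed run.
print(percent_reduction(100.0, 64.0))  # time-to-solution saving in percent
```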
ABSTRACT
A newly created object-oriented program for automating the process of fitting molecular-mechanics parameters to ab initio data, termed ParFit, is presented. ParFit uses a hybrid of deterministic and stochastic genetic algorithms. ParFit can simultaneously handle several molecular-mechanics parameters in multiple molecules and can also apply symmetric and antisymmetric constraints on the optimized parameters. The simultaneous handling of several molecules enhances the transferability of the fitted parameters. ParFit is written in Python, uses a rich set of standard and nonstandard Python libraries, and can be run in parallel on multicore computer systems. As an example, a series of phosphine oxides, important for metal extraction chemistry, are parametrized using ParFit. ParFit is an open source program available for free on GitHub (https://github.com/fzahari/ParFit).
Subjects
Programming Languages, Quantum Theory, Statistics as Topic/methods, Algorithms
ABSTRACT
Geminate recombination of 24 radical anions (M•−) with solvated protons (RH2+) was studied in tetrahydrofuran (THF) with pulse radiolysis. The recombination has two steps: (1) diffusion of M•− and RH2+ together to form intimate (contact and solvent separated) ion pairs, driven by Coulomb attraction; (2) annihilation of anions due to proton transfer (PT) from RH2+ to M•−. The non-exponential time-dependence of the geminate diffusion was determined. For all molecules protonated on O or N atoms the subsequent PT step is too fast (<0.2 ns) to measure, except for the anion of TCNE, which did not undergo proton transfer. PT to C atoms was as slow as 70 ns and was always slow enough to be observable. A possible effect of charge delocalization on the PT rates could not be clearly separated from other factors. For 21 of the 24 molecules studied here, a free ion yield (71.6 ± 6.2 nmol J^-1), comprising ~29% of the total, was formed. This yield of "Type I" free ions is independent of the PT rate because it arises entirely by escape from the initial distribution of ion pair distances without forming intimate ion pairs. Three anions of oligo(9,9-dihexyl)fluorenes, Fn•− (n = 2-4), were able to escape from intimate ion pairs to form additional yields of "Type II" free ions with escape rate constants near 3 × 10^6 s^-1. These experiments find no evidence for an inverted region for proton transfer.
ABSTRACT
Knowing the tautomeric form of malonic acid (MA) in concentrated particles is critical to understanding its effect on the atmosphere. Energies and vibrational modes of hydrated MA particles were calculated using density functional theory (DFT) at the B3LYP/6-31G(d,p) level and the effective fragment potential (EFP) method. Visualization of the keto and enol isomer vibrational modes enabled the assignment of keto isomer peaks in the 1710-1750 cm^-1 range, and previously unidentified experimental IR peaks in the 1690-1710 cm^-1 range can now be attributed to the enol isomer. Comparison of calculated spectra of pure hydrated enol or keto isomers confirms recent experimental evidence, presented by Ghorai et al. (J. Phys. Chem. A 2011, 115, 4373-4380), of a shift in the keto-enol tautomer equilibrium when MA exists as concentrated particles.