ABSTRACT
Since the advent of the first computers, chemists have been at the forefront of using computers to understand and solve complex chemical problems. As the hardware and software have evolved, so have the theoretical and computational chemistry methods and algorithms. Parallel computers clearly changed the common computing paradigm in the late 1970s and 80s, and the field has again seen a paradigm shift with the advent of graphical processing units. This review explores the challenges and some of the solutions in transforming software from the terascale to the petascale and now to the upcoming exascale computers. While discussing the field in general, NWChem and its redesign, NWChemEx, will be highlighted as one of the early codesign projects to take advantage of massively parallel computers and emerging software standards to enable large scientific challenges to be tackled.
ABSTRACT
For many computational chemistry packages, being able to efficiently and effectively scale across an exascale cluster is a heroic feat. Collective experience from the Department of Energy's Exascale Computing Project suggests that achieving exascale performance requires far more planning, design, and optimization than scaling to petascale. In many cases, entire rewrites of software are necessary to address fundamental algorithmic bottlenecks. This in turn requires a tremendous amount of resources and development time, resources that cannot reasonably be afforded by every computational science project. It thus becomes imperative that computational science transition to a more sustainable paradigm. Key to such a paradigm is modular software. While the importance of modular software is widely recognized, what is perhaps not so widely appreciated is the effort still required to leverage modular software in a sustainable manner. The present manuscript introduces PluginPlay, https://github.com/NWChemEx-Project/PluginPlay, an inversion-of-control framework designed to facilitate developing, maintaining, and sustaining modular scientific software packages. This manuscript focuses on the design aspects of PluginPlay and how they specifically influence the performance of the resulting package. Although PluginPlay serves as the framework for the NWChemEx package, PluginPlay is not tied to NWChemEx or even computational chemistry. We thus anticipate that PluginPlay will prove generally useful for a number of computational science packages looking to transition to the exascale.
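The inversion-of-control idea mentioned above can be illustrated with a minimal Python sketch. This is not PluginPlay's API; every name here (`ModuleManager`, `register`, `run`, and the two toy energy functions) is hypothetical and chosen only for illustration. The sketch shows the general pattern: calling code asks for a capability by key, and the framework decides which registered module actually runs.

```python
from typing import Callable, Dict


class ModuleManager:
    """Hypothetical registry; illustrates inversion of control only."""

    def __init__(self) -> None:
        self._modules: Dict[str, Callable[..., float]] = {}

    def register(self, key: str, module: Callable[..., float]) -> None:
        # A later registration replaces an earlier one, so an implementation
        # can be swapped without touching the calling code.
        self._modules[key] = module

    def run(self, key: str, *args: float) -> float:
        if key not in self._modules:
            raise KeyError(f"no module registered under {key!r}")
        return self._modules[key](*args)


def harmonic_energy(k: float, x: float) -> float:
    """Toy 'energy' module: E = 1/2 k x^2."""
    return 0.5 * k * x * x


def quartic_energy(k: float, x: float) -> float:
    """Alternative toy module with the same call signature: E = k x^4."""
    return k * x ** 4


manager = ModuleManager()
manager.register("energy", harmonic_energy)
first = manager.run("energy", 2.0, 3.0)

# Swap the implementation; the call site stays unchanged.
manager.register("energy", quartic_energy)
second = manager.run("energy", 2.0, 3.0)
```

The design point is that the caller depends only on the key and the call signature, which is what allows modules to be replaced independently of the code that uses them.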
ABSTRACT
Despite the recent availability of vaccines against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the search for inhibitory therapeutic agents has assumed importance, especially in the context of emerging new viral variants. In this paper, we describe the discovery of a novel noncovalent small-molecule inhibitor, MCULE-5948770040, that binds to and inhibits the SARS-CoV-2 main protease (Mpro) by employing a scalable high-throughput virtual screening (HTVS) framework and a targeted compound library of over 6.5 million molecules that could be readily ordered and purchased. Our HTVS framework leverages the U.S. supercomputing infrastructure, achieving nearly 91% resource utilization and nearly 126 million docking calculations per hour. Downstream biochemical assays validate this Mpro inhibitor with an inhibition constant (Ki) of 2.9 µM (95% CI 2.2, 4.0). Furthermore, using room-temperature X-ray crystallography, we show that MCULE-5948770040 binds to a cleft in the primary binding site of Mpro, forming stable hydrogen-bond and hydrophobic interactions. We then used multiple µs-timescale molecular dynamics (MD) simulations and machine learning (ML) techniques to elucidate how the bound ligand alters the conformational states accessed by Mpro, involving motions both proximal and distal to the binding site. Together, our results demonstrate how MCULE-5948770040 inhibits Mpro and offer a springboard for further therapeutic design.
Subjects
COVID-19, Protease Inhibitors, Antiviral Agents, Coronavirus 3C Proteases, Humans, Molecular Docking Simulation, Molecular Dynamics Simulation, Orotic Acid/analogs & derivatives, Piperazines, SARS-CoV-2
ABSTRACT
The kinetics of a reaction network that follows mass-action rate laws can be described with a system of ordinary differential equations (ODEs) with a polynomial right-hand side. However, it is challenging to derive such kinetic differential equations from transient kinetic data without knowing the reaction network, especially when the data are incomplete due to experimental limitations. We introduce a program, PolyODENet, toward this goal. Based on the machine-learning method Neural ODE, PolyODENet defines a generative model and predicts concentrations at arbitrary times. As such, it is possible to include unmeasurable intermediate species in the kinetic equations. Importantly, we have implemented various measures to apply physical constraints and chemical knowledge in the training to regularize the solution space. Using simple catalytic reaction models, we demonstrate that PolyODENet can predict reaction profiles of unknown species and, in doing so, even reveal hidden parts of reaction mechanisms.
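As an illustration of what "ODEs with a polynomial right-hand side" means for mass-action kinetics, the following Python sketch integrates a toy catalytic network, A + cat -> I and I -> B + cat. It covers only the forward problem (known network, known rate constants) and is not PolyODENet, which addresses the inverse problem. The network, the rate constants `k1` and `k2`, the initial concentrations, and the hand-written Runge-Kutta integrator are all assumptions made for the example.

```python
from typing import Callable, List

State = List[float]


def rhs(y: State, k1: float = 2.0, k2: float = 1.0) -> State:
    """Mass-action rates for A + cat -> I (k1) and I -> B + cat (k2)."""
    a, cat, inter, b = y
    r1 = k1 * a * cat   # bimolecular step: quadratic in the concentrations
    r2 = k2 * inter     # unimolecular step: linear in the concentrations
    # d[A]/dt, d[cat]/dt, d[I]/dt, d[B]/dt
    return [-r1, -r1 + r2, r1 - r2, r2]


def rk4_step(f: Callable[[State], State], y: State, h: float) -> State:
    """One classical fourth-order Runge-Kutta step of size h."""
    s1 = f(y)
    s2 = f([yi + 0.5 * h * si for yi, si in zip(y, s1)])
    s3 = f([yi + 0.5 * h * si for yi, si in zip(y, s2)])
    s4 = f([yi + h * si for yi, si in zip(y, s3)])
    return [
        yi + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
        for yi, p, q, r, s in zip(y, s1, s2, s3, s4)
    ]


def integrate(y0: State, t_end: float, n_steps: int) -> State:
    """Integrate from t = 0 to t_end with a fixed step size."""
    y = list(y0)
    h = t_end / n_steps
    for _ in range(n_steps):
        y = rk4_step(rhs, y, h)
    return y


y0 = [1.0, 0.1, 0.0, 0.0]  # [A], [cat], [I], [B] at t = 0
y_final = integrate(y0, t_end=5.0, n_steps=5000)
```

Two linear combinations are conserved by this network, [A] + [I] + [B] and [cat] + [I]. Conservation laws of this kind are one example of the physical constraints that can be used to restrict the space of candidate kinetic equations.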
Subjects
Algorithms, Kinetics
ABSTRACT
Synchrotron X-ray-based in situ metrology is advantageous for monitoring the synthesis of battery materials, offering high throughput, high spatial and temporal resolution, and chemical sensitivity. However, the rapid generation of massive data poses a challenge to the on-site, on-the-fly analysis needed for real-time process monitoring. Here, a weighted lagged cross-correlation (WLCC) similarity approach is presented for automated data analysis, which merges with in situ synchrotron X-ray diffraction metrology to monitor the calcination process of the archetypal nickel-based cathode, LiNiO2. The WLCC approach, incorporating variables that account for peak shifts and width changes associated with structural transformations, enables rapid extraction of phase progression within 10 seconds from tens of diffraction patterns. Details are captured, from initial precursors to intermediates and the final layered LiNiO2, providing information for agile on-site adjustments during experiments and complementing post hoc diffraction analysis by offering insights into early-stage phase nucleation and growth. Expanding this data-powered platform paves the way for real-time calcination process monitoring and control, which is pivotal to quality control in battery cathode manufacturing.
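The abstract does not give the WLCC formula, so the following Python sketch should be read only as a generic lagged cross-correlation with an assumed exponential lag penalty, not as the authors' method. It handles peak shifts only and ignores width changes. The function names, the `penalty` parameter, and the synthetic Gaussian "diffraction peaks" are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lagged_similarity(
    pattern: Sequence[float],
    reference: Sequence[float],
    max_lag: int,
    penalty: float = 0.01,
) -> Tuple[float, int]:
    """Best lag-penalized correlation and the lag at which it occurs.

    A positive lag means features in `pattern` sit at higher indices
    than the matching features in `reference`.
    """
    best_score, best_lag = -math.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        # Index range of `reference` that overlaps `pattern` shifted by lag.
        lo = max(0, -lag)
        hi = min(len(reference), len(pattern) - lag)
        if hi - lo < 2:
            continue
        r = pearson(pattern[lo + lag:hi + lag], reference[lo:hi])
        # Assumed weighting: larger shifts are trusted less.
        score = r * math.exp(-penalty * abs(lag))
        if score > best_score:
            best_score, best_lag = score, lag
    return best_score, best_lag


def gaussian_peak(center: int, n: int = 100, sigma: float = 2.0) -> List[float]:
    """Synthetic single-peak pattern on n grid points."""
    return [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2)) for i in range(n)]


reference = gaussian_peak(50)
shifted = gaussian_peak(53)
score, lag = lagged_similarity(shifted, reference, max_lag=10)
```

In this toy case the search recovers the three-point shift between the two patterns, and the score is reduced only by the lag penalty.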
ABSTRACT
Heme has a critical role in the chemical framework of the cell as an essential protein cofactor and signaling molecule that controls diverse processes and molecular interactions. Using a phylogenomics-based approach and complementary structural techniques, we identify a family of dimeric hemoproteins comprising a domain of unknown function DUF2470. The heme iron is axially coordinated by two zinc-bound histidine residues, forming a distinct two-fold symmetric zinc-histidine-iron-histidine-zinc site. Together with structure-guided in vitro and in vivo experiments, we further demonstrate the existence of a functional link between heme binding by Dri1 (Domain related to iron 1, formerly ssr1698) and post-translational regulation of succinate dehydrogenase in the cyanobacterium Synechocystis, suggesting an iron-dependent regulatory link between photosynthesis and respiration. Given the ubiquity of proteins containing homologous domains and connections to heme metabolism across eukaryotes and prokaryotes, we propose that DRI (Domain Related to Iron; formerly DUF2470) functions at the molecular level as a heme-dependent regulatory domain.
Subjects
Hemeproteins, Synechocystis, Heme, Zinc, Histidine, Hemeproteins/genetics, Synechocystis/genetics, Carbon, Iron
ABSTRACT
Metal homeostasis has evolved to tightly modulate the availability of metals within the cell, avoiding cytotoxic interactions due to excess and protein inactivity due to deficiency. Even in the presence of homeostatic processes, however, low bioavailability of these essential metal nutrients in soils can negatively impact crop health and yield. While research has largely focused on how plants assimilate metals, acclimation to metal-limited environments requires a suite of strategies that are not necessarily involved in metal transport across membranes. The identification of these mechanisms provides a new opportunity to improve metal-use efficiency and develop plant foodstuffs with increased concentrations of bioavailable metal nutrients. Here, we investigate the function of two distinct subfamilies of the nucleotide-dependent metallochaperones (NMCs), named ZNG1 and ZNG2, that are found in plants, using Arabidopsis thaliana as a reference organism. AtZNG1 (AT1G26520) is an ortholog of human and fungal ZNG1 and, like its previously characterized eukaryotic relatives, localizes to the cytosol and physically interacts with methionine aminopeptidase type I (AtMAP1A). Analysis of AtZNG1, AtMAP1A, AtMAP2A, and AtMAP2B transgenic mutants is consistent with the role of Arabidopsis ZNG1 as a Zn transferase for AtMAP1A, as previously described in yeast and zebrafish. Structural modeling reveals a flexible cysteine-rich loop that we hypothesize enables direct transfer of Zn from AtZNG1 to AtMAP1A during GTP hydrolysis. Based on proteomics and transcriptomics, loss of this ancient and conserved mechanism has pleiotropic consequences impacting the expression of hundreds of genes, including those involved in photosynthesis and vesicle transport. Members of the plant-specific family of NMCs, ZNG2A1 (AT1G80480) and ZNG2A2 (AT1G15730), are also required during Zn deficiency, but their target protein(s) remain to be discovered. RNA-seq analyses reveal wide-ranging impacts across the cell when the genes encoding these plastid-localized NMCs are disrupted.
ABSTRACT
The recently proposed universal state-selective (USS) corrections [K. Kowalski, J. Chem. Phys. 134, 194107 (2011)] to approximate multi-reference coupled-cluster (MRCC) energies can be commonly applied to any type of MRCC theory based on the Jeziorski-Monkhorst [B. Jeziorski and H. J. Monkhorst, Phys. Rev. A 24, 1668 (1981)] exponential ansatz. In this paper we report on the performance of a simple USS correction to the Brillouin-Wigner and Mukherjee's MRCC approaches employing single and double excitations (USS-BW-MRCCSD and USS-Mk-MRCCSD). It is shown that the USS-BW-MRCCSD correction, which employs the manifold of single and double excitations, can be related to a posteriori corrections utilized in routine BW-MRCCSD calculations. In several benchmark calculations we compare the USS-BW-MRCCSD and USS-Mk-MRCCSD results with the results obtained with the full configuration interaction method.
ABSTRACT
In this paper we discuss the performance of the non-iterative state-specific multireference coupled cluster (SS-MRCC) methods accounting for the effect of triply excited cluster amplitudes. The corrections to the Brillouin-Wigner and Mukherjee's MRCC models based on the manifold of singly and doubly excited cluster amplitudes (BW-MRCCSD and Mk-MRCCSD, respectively) are tested and compared with exact full configuration interaction results for small systems (H2O, N2, and Be3). For the larger systems (naphthyne isomers) the BW-MRCC and Mk-MRCC methods with iterative singles, doubles, and non-iterative triples (BW-MRCCSD(T) and Mk-MRCCSD(T)) are compared against the results obtained with single reference coupled cluster methods. We also report on the parallel performance of the non-iterative implementations based on the use of processor groups.
ABSTRACT
Parallel hardware has become readily available to the computational chemistry research community. This perspective will review the current state of parallel computational chemistry software utilizing high-performance parallel computing platforms. Hardware and software trends and their effect on quantum chemistry methodologies, algorithms, and software development will also be discussed.
ABSTRACT
The predominance of Kohn-Sham density functional theory (KS-DFT) for the theoretical treatment of large experimentally relevant systems in molecular chemistry and materials science relies primarily on the existence of efficient software implementations that are capable of leveraging the latest advances in modern high-performance computing (HPC). With recent trends in HPC leading toward increasing reliance on heterogeneous accelerator-based architectures such as graphics processing units (GPUs), existing code bases must embrace these architectural advances to maintain the high levels of performance that have come to be expected for these methods. In this work, we propose a three-level parallelism scheme for the distributed numerical integration of the exchange-correlation (XC) potential in the Gaussian basis set discretization of the Kohn-Sham equations on large computing clusters consisting of multiple GPUs per compute node. In addition, we propose and demonstrate the efficacy of batched kernels, including batched level-3 BLAS operations, in achieving high levels of performance on the GPU. We demonstrate the performance and scalability of the implementation of the proposed method in the NWChemEx software package by comparing to the existing scalable CPU XC integration in NWChem.
ABSTRACT
We have investigated the description of excited state relaxation in naked and hydrated TiO2 nanoparticles using Time-Dependent Density Functional Theory (TD-DFT) with three common hybrid exchange-correlation (XC) potentials: B3LYP, CAM-B3LYP and BHLYP. Use of TD-CAM-B3LYP and TD-BHLYP yields qualitatively similar results for all structures, which are also consistent with predictions of coupled-cluster theory for small particles. TD-B3LYP, in contrast, is found to make rather different predictions, including apparent conical intersections for certain particles that are observed with neither TD-CAM-B3LYP nor TD-BHLYP. In line with our previous observations for vertical excitations, the issue with TD-B3LYP appears to be the inherent tendency of TD-B3LYP, and other XC potentials with no or a low percentage of Hartree-Fock-like exchange, to spuriously stabilize the energy of charge-transfer (CT) states. Even in the case of hydrated particles, for which vertical excitations are generally well described with all XC potentials, the use of TD-B3LYP appears to result in CT problems during excited state relaxation for certain particles. We hypothesize that the spurious stabilization of CT states by TD-B3LYP may even drive the excited state optimizations to different excited state geometries from those obtained using TD-CAM-B3LYP or TD-BHLYP. Finally, focusing on the TD-CAM-B3LYP and TD-BHLYP results, excited state relaxation in small naked and hydrated TiO2 nanoparticles is predicted to be associated with a large Stokes shift.
ABSTRACT
A parallel implementation of analytical time-dependent density functional theory gradients is presented for the quantum chemistry program NWChem. The implementation is based on the Lagrangian approach developed by Furche and Ahlrichs. To validate our implementation, we first calculate the Stokes shifts for a range of organic dye molecules using a diverse set of exchange-correlation functionals (traditional density functionals, global hybrids, and range-separated hybrids) followed by simulations of the one-photon absorption and resonance Raman scattering spectrum of the phenoxyl radical, the well-studied dye molecule rhodamine 6G, and a molecular host-guest complex (TTF⊂CBPQT4+). The study of organic dye molecules illustrates that B3LYP and CAM-B3LYP generally give the best agreement with experimentally determined Stokes shifts unless the excited state is a charge transfer state. Absorption, resonance Raman, and fluorescence simulations for the phenoxyl radical indicate that explicit solvation may be required for accurate characterization. For the host-guest complex and rhodamine 6G, it is demonstrated that absorption spectra can be simulated in good agreement with experimental data for most exchange-correlation functionals. However, because one-photon absorption spectra generally lack well-resolved vibrational features, resonance Raman simulations are necessary to evaluate the accuracy of the exchange-correlation functional for describing a potential energy surface.
ABSTRACT
High performance computing platforms are expected to deliver 10^18 floating-point operations per second by the year 2022 through the deployment of millions of cores. Even if every core is highly reliable, the sheer number of them will mean that the mean time between failures will become so short that most application runs will suffer at least one fault. In particular, soft errors caused by intermittent incorrect behavior of the hardware are a concern, as they lead to silent data corruption. In this paper we investigate the impact of soft errors on optimization algorithms using Hartree-Fock as a particular example. Optimization algorithms iteratively reduce the error in the initial guess to reach the intended solution. Therefore, they may intuitively appear to be resilient to soft errors. Our results show that this is true for soft errors of small magnitudes but not for large errors. We suggest error detection and correction mechanisms for different classes of data structures. The results obtained with these mechanisms indicate that we can correct more than 95% of the soft errors at moderate increases in the computational cost.
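The claim that iterative solvers absorb small soft errors but not large ones can be illustrated with a toy analog in Python. This is not Hartree-Fock and not the paper's detection scheme: it uses the Babylonian fixed-point iteration for a square root, injects an additive error at one iteration, and optionally applies a simple bounds check. The function name, the `fault` and `guard` parameters, and the bound 1 <= x <= a (valid here because a >= 1 and the initial guess lies in that range) are all assumptions made for the example.

```python
import math
from typing import Optional


def babylonian_sqrt(
    a: float,
    x0: float,
    n_iter: int,
    fault_at: Optional[int] = None,
    fault: float = 0.0,
    guard: bool = False,
) -> float:
    """Iterate x <- (x + a / x) / 2, optionally injecting one soft error.

    Assumes a >= 1 and 1 <= x0 <= a, so every fault-free iterate stays in
    the interval [1, a]; the guard uses that interval as a sanity bound.
    """
    x = x0
    for k in range(n_iter):
        previous = x
        x = 0.5 * (x + a / x)
        if k == fault_at:
            x += fault  # simulated silent data corruption
        if guard and not (1.0 <= x <= a):
            x = previous  # detected: discard the update and retry from here
    return x


clean = babylonian_sqrt(2.0, 1.0, 30)
small_fault = babylonian_sqrt(2.0, 1.0, 30, fault_at=3, fault=1e-3)
large_fault = babylonian_sqrt(2.0, 1.0, 30, fault_at=3, fault=-10.0)
guarded = babylonian_sqrt(2.0, 1.0, 30, fault_at=3, fault=-10.0, guard=True)
```

With the small perturbation the iteration still converges to the intended root. The large perturbation pushes the iterate to a negative value, from which the same iteration converges to the other fixed point, the negative root, so the run finishes without any visible failure but with the wrong answer. The bounds check restores the intended result in this toy case.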
ABSTRACT
A novel parallel algorithm for noniterative multireference coupled cluster (MRCC) theories, which merges recently introduced reference-level parallelism (RLP) [Bhaskaran-Nair, K.; Brabec, J.; Aprà, E.; van Dam, H. J. J.; Pittner, J.; Kowalski, K. J. Chem. Phys. 2012, 137, 094112] with the possibility of accelerating numerical calculations using graphics processing units (GPUs), is presented. We discuss the performance of this approach applied to the MRCCSD(T) method (iterative singles and doubles and perturbative triples), where the corrections due to triples are added to the diagonal elements of the MRCCSD effective Hamiltonian matrix. The performance of the combined RLP/GPU algorithm is illustrated on the example of the Brillouin-Wigner (BW) and Mukherjee (Mk) state-specific MRCCSD(T) formulations.
ABSTRACT
A novel algorithm for implementing a general type of multireference coupled-cluster (MRCC) theory based on the Jeziorski-Monkhorst exponential ansatz [Jeziorski, B.; Monkhorst, H. J. Phys. Rev. A 1981, 24, 1668] is introduced. The proposed algorithm utilizes processor groups to calculate the equations for the MRCC amplitudes. In the basic formulation, each processor group constructs the equations related to a specific subset of references. By flexible choice of processor groups and of the subset of reference-specific sufficiency conditions assigned to a given group, one can ensure optimum utilization of available computing resources. The performance of this algorithm is illustrated on the examples of the Brillouin-Wigner and Mukherjee MRCC methods with singles and doubles (BW-MRCCSD and Mk-MRCCSD). A significant improvement in scalability and a reduction in time to solution are reported with respect to a recently reported parallel implementation of the BW-MRCCSD formalism [Brabec, J.; van Dam, H. J. J.; Kowalski, K.; Pittner, J. Chem. Phys. Lett. 2011, 514, 347].
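The bookkeeping behind "each processor group constructs the equations related to a specific subset of references" can be sketched in a few lines of Python. This is an illustrative assumption, not the scheme implemented in the paper: ranks are split into contiguous groups of near-equal size and references are dealt out round-robin. The function names and both partitioning rules are hypothetical, and no actual message passing is performed.

```python
from typing import List


def make_processor_groups(n_procs: int, n_groups: int) -> List[List[int]]:
    """Split ranks 0..n_procs-1 into contiguous groups of near-equal size."""
    if not 1 <= n_groups <= n_procs:
        raise ValueError("need 1 <= n_groups <= n_procs")
    base, extra = divmod(n_procs, n_groups)
    groups: List[List[int]] = []
    start = 0
    for g in range(n_groups):
        # The first `extra` groups take one additional rank each.
        size = base + (1 if g < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def assign_references(n_refs: int, n_groups: int) -> List[List[int]]:
    """Round-robin assignment: reference mu is handled by group mu % n_groups."""
    if n_groups < 1:
        raise ValueError("need at least one group")
    return [list(range(g, n_refs, n_groups)) for g in range(n_groups)]


groups = make_processor_groups(n_procs=10, n_groups=4)
work = assign_references(n_refs=6, n_groups=4)
```

When the number of references is not a multiple of the number of groups, as in this example, some groups receive more references than others. That imbalance is the kind of trade-off a flexible choice of group count and group size is meant to address.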
ABSTRACT
In the past couple of decades, the massive computational power provided by the most modern supercomputers has enabled simulations with higher-order computational chemistry methods that were previously considered intractable. As system sizes continue to increase, the computational chemistry domain continues this trend using parallel computing with programming models such as the Message Passing Interface (MPI) and Partitioned Global Address Space (PGAS) models such as Global Arrays. The ever-increasing scale of these supercomputers comes at the cost of a reduced Mean Time Between Failures (MTBF), currently on the order of days and projected to be on the order of hours for upcoming extreme-scale systems. While traditional disk-based checkpointing methods are ubiquitous for storing intermediate solutions, they suffer from the high overhead of writing and recovering from checkpoints. In practice, checkpointing itself often brings the system down. Clearly, methods beyond checkpointing are imperative for handling the worsening problem of decreasing MTBF. In this paper, we address this challenge by designing and implementing an efficient fault-tolerant version of the Coupled Cluster (CC) method with NWChem, using in-memory data redundancy. We present the challenges associated with our design, including an efficient data storage model, maintenance of at least one consistent data copy, and the recovery process. Our performance evaluation without faults shows that the current design exhibits a small overhead. In the presence of a simulated fault, the proposed design incurs negligible overhead in comparison to the state-of-the-art implementation without faults.
ABSTRACT
The SO2-binding properties of a series of η6,η1-NCN-pincer ruthenium platinum complexes (NCN = 2,6-bis[(dimethylamino)methyl]phenyl anion) have been studied by both UV-visible spectroscopy and theoretical calculations. When an electron-withdrawing [Ru(C5R5)]+ fragment (R = H or Me) is η6-coordinated to the phenyl ring of the NCN-pincer platinum fragment (cf. [2]+ and [3]+, see Scheme 1), the characteristic orange coloration (pointing to η1-SO2 binding to Pt) of a solution of the parent NCN-pincer platinum complex 1 in dichloromethane upon SO2 bubbling is not observed. However, when the ruthenium center is η6-coordinated to a phenyl substituent linked in the para position to the carbon-to-platinum bond, i.e. complex [4]+, the SO2-binding property of the NCN-platinum center seems to be retained, as bubbling SO2 into a solution of the latter complex produces the characteristic orange color. We performed theoretical calculations at the MP2 level of approximation and TD-DFT studies, which enabled us to interpret the absence of color change in the case of [2]+ as an absence of coordination of SO2 to platinum. We attribute this absent or weaker SO2 coordination in dichloromethane to the relative electron-poorness of the platinum center in the respective η6-ruthenium-coordinated NCN-pincer platinum complexes, which leads to a lower binding energy and an elongated calculated Pt-S bond distance. We also discuss the effects of electrostatic interactions in these cationic systems, which also seem to play a destabilizing role for complex [2(SO2)]+.