Hybrid programming-model strategies for GPU offloading of electronic structure calculation kernels.
Fattebert, Jean-Luc; Negre, Christian F A; Finkelstein, Joshua; Mohd-Yusof, Jamaludin; Osei-Kuffuor, Daniel; Wall, Michael E; Zhang, Yu; Bock, Nicolas; Mniszewski, Susan M.
Affiliation
  • Fattebert JL; Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA.
  • Negre CFA; Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA.
  • Finkelstein J; Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA.
  • Mohd-Yusof J; Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA.
  • Osei-Kuffuor D; Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, California 94550, USA.
  • Wall ME; Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA.
  • Zhang Y; Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA.
  • Bock N; Canonical USA Inc., Eatontown, New Jersey 07724, USA.
  • Mniszewski SM; Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA.
J Chem Phys; 160(12), 2024 Mar 28.
Article in En | MEDLINE | ID: mdl-38551311
ABSTRACT
To address the challenge of performance portability and facilitate the implementation of electronic structure solvers, we developed the Basic Matrix Library (BML) and the Parallel, Rapid O(N), and Graph-based Recursive Electronic Structure Solver (PROGRESS) library. BML implements the linear algebra operations necessary for electronic structure kernels using a unified user interface for various matrix formats (dense and sparse) and architectures (CPUs and GPUs). Focusing on density functional theory and tight-binding models, PROGRESS implements several solvers for computing the single-particle density matrix and relies on BML. In this paper, we describe the general strategies used for these implementations on various computer architectures, using OpenMP target functionalities on GPUs in conjunction with third-party libraries to handle performance-critical numerical kernels. We demonstrate the portability of this approach and its performance on benchmark problems.
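
The idea of a unified interface over several matrix formats can be illustrated with a minimal sketch. This is not BML's API: the names `DenseMatrix`, `CSRMatrix`, `to_csr`, and `matvec` are hypothetical and chosen only for illustration. Compressed sparse row storage is assumed here as one example of a sparse format, and the code runs on the CPU only, with no GPU offloading. The point is that the caller uses one entry point and the storage format decides which kernel runs.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class DenseMatrix:
    """Row-major dense storage: rows[i][j] holds element (i, j)."""
    rows: List[List[float]]


@dataclass
class CSRMatrix:
    """Compressed sparse row storage (hypothetical stand-in for a sparse format)."""
    n_cols: int
    row_ptr: List[int]    # row i occupies entries row_ptr[i]:row_ptr[i + 1]
    col_idx: List[int]    # column index of each stored entry
    values: List[float]   # value of each stored entry


Matrix = Union[DenseMatrix, CSRMatrix]


def to_csr(dense: DenseMatrix, threshold: float = 0.0) -> CSRMatrix:
    """Convert dense to CSR, dropping entries with |a_ij| <= threshold."""
    row_ptr, col_idx, values = [0], [], []
    for row in dense.rows:
        for j, a in enumerate(row):
            if abs(a) > threshold:
                col_idx.append(j)
                values.append(a)
        row_ptr.append(len(values))
    n_cols = len(dense.rows[0]) if dense.rows else 0
    return CSRMatrix(n_cols, row_ptr, col_idx, values)


def matvec(a: Matrix, x: List[float]) -> List[float]:
    """Compute y = A x through one entry point, whatever the storage format."""
    if isinstance(a, DenseMatrix):
        # Dense kernel: full dot product of every row with x.
        return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a.rows]
    if isinstance(a, CSRMatrix):
        # Sparse kernel: visit only the stored entries of each row.
        y = []
        for i in range(len(a.row_ptr) - 1):
            start, end = a.row_ptr[i], a.row_ptr[i + 1]
            y.append(sum(a.values[k] * x[a.col_idx[k]] for k in range(start, end)))
        return y
    raise TypeError(f"unsupported matrix format: {type(a).__name__}")
```

Calling `matvec` on a `DenseMatrix` and on the `CSRMatrix` obtained from it with `to_csr` should give the same vector, so code written against this single function does not need to change when the storage format does.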

Full text: 1 Database: MEDLINE Language: En Year: 2024 Type: Article
