Argonne Leadership Computing Facility Researchers Work to Port Quantum Simulation Codes to GPUs
(Argonne Leadership Computing Facility) As part of a new series aimed at sharing best practices in preparing applications for Aurora, Argonne Leadership Computing Facility (ALCF) researchers are working to port quantum simulation code to GPUs. ALCF is highlighting these researchers’ efforts to optimize codes to run efficiently on graphics processing units.
Argonne computational scientist Ye Luo and ALCF postdoctoral appointee Pankaj Rajak are tasked with preparing QXMD, a Fortran-based scalable quantum molecular dynamics code, to run on GPUs. QXMD emerged from an Aurora Early Science Program project, “Metascalable Layered Materials Genome,” led by Aiichiro Nakano of the University of Southern California and aimed at readying materials science for the delivery of the ALCF’s exascale machine, Aurora. The simulations produced by the code explore nonadiabatic quantum molecular dynamics, which sits at the nexus of physics, chemistry, and materials science; these types of quantum simulations are extremely computationally expensive.
Because no prior GPU version of the code existed, the team had to find a way to add performance-portable GPU acceleration to the code without interrupting scientific production.
Once Intel compilers became available, the team began validating them against code of progressively greater complexity. The team keeps several versions of the code. Earlier versions are simpler, use fewer OpenMP offload features, and demand less of the compiler, whereas later versions integrate more offload regions and advanced OpenMP features that place more challenging demands on the compiler. This allows the team to solve smaller problems piecemeal while retaining the ability to stress-test the compiler. Meanwhile, the full QXMD program is used to validate the Intel Fortran compiler on CPU systems.
This Argonne-Intel co-design approach has led the software stack to mature quickly to production quality for execution on Aurora, benefiting many developer teams.