Preparing CP2K for LUMI: User Experiences and Lessons Learned
Description: CP2K is an open-source quantum chemistry and molecular dynamics software package comprising a broad range of computational algorithms and simulation methods that can be flexibly combined to solve a given problem. Additional functionality is provided by GPU-accelerated libraries such as COSMA, for parallel multiplication of large, skinny matrices, and DBCSR, for sparse matrix multiplication. CP2K has been tuned to run on massively parallel, large-scale systems such as LUMI and Frontier. In this talk, we describe performance optimizations for CP2K benchmarks that take advantage of key features of HPE servers equipped with four AMD MI250X GPUs and a 3rd-generation AMD EPYC CPU, all interconnected with Infinity Fabric™ links. We focus primarily on two benchmarks, 128-H2O-RPA and H2O-LS-DFT, designed to stress all compute and network components. COSMA was enhanced with GPU-aware MPI and RCCL support, CP2K’s GRID HIP backend was written from scratch to improve wave occupancy and reduce register spills, the Plane Wave (PW) algorithm was adapted to use rocFFT, and DBCSR was enhanced with GPU-aware MPI. In addition, we discuss how affinity, GPU oversubscription, and other hardware features contributed to higher performance. Finally, we cover the lessons learned and the challenges faced in this process.
Time: Tuesday, June 27, 16:00 - 16:30 CEST
Computer Science, Machine Learning, and Applied Mathematics