BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230831T095746Z
LOCATION:Flüela
DTSTART;TZID=Europe/Stockholm:20230627T160000
DTEND;TZID=Europe/Stockholm:20230627T163000
UID:submissions.pasc-conference.org_PASC23_sess129_msa115@linklings.com
SUMMARY:Preparing CP2K for LUMI: User Experiences and Lessons Learned
DESCRIPTION:Minisymposium\n\nMathieu Taillefumier (ETH Zurich / CSCS), Gin
 a Sitaraman (AMD), Alfio Lazzaro (HPE), Leopold Grinberg (AMD), and Marko 
 Kabic (ETH Zurich)\n\nCP2K is an open-source quantum chemistry and molecu
 lar dynamics software package comprising a broad range of computational
  algorithms and simulation methods that can be flexibly combined to solve 
 a given problem. Additional functionality is provided by GPU-accelerated l
 ibraries such as COSMA for parallel multiplication of large skinny matrice
 s and DBCSR for sparse matrix multiplications. CP2K has been tuned for run
 ning on massively parallel large-scale systems such as LUMI and Frontier. 
 In this talk, we will describe performance optimizations for CP2K benchmar
 ks which take advantage of the key features in HPE servers equipped with 4
  AMD MI250X GPUs and a 3rd-generation AMD EPYC CPU, all interconnected
  with Infinity Fabric™ links. We focus primarily on two benchmarks,
  128-H2O-RPA and H2O-LS-DFT, designed to stress all compute and network
  components. COS
 MA was enhanced with GPU-aware MPI and RCCL support, CP2K’s GRID HIP backe
 nd was written from scratch to improve wave occupancy and reduce register 
 spills, the Plane Wave (PW) algorithm was adapted to use rocFFT, and DBCSR
  was enhanced with GPU-aware MPI. In addition, we will discuss how affinit
 y, oversubscription of GPUs and other hardware features helped achieve sup
 erior performance. We will cover lessons learned and challenges faced in t
 his process.\n\nDomain: Computer Science, Machine Learning, and Applied Ma
 thematics\n\nSession Chair: Aniello Esposito (HPE)
END:VEVENT
END:VCALENDAR
