BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230831T095746Z
LOCATION:Seehorn
DTSTART;TZID=Europe/Stockholm:20230627T150000
DTEND;TZID=Europe/Stockholm:20230627T153000
UID:submissions.pasc-conference.org_PASC23_sess184_pap129@linklings.com
SUMMARY:A Massively Parallel Multi-Scale FE2 Framework for Multi-Trillion 
 Degrees of Freedom Simulations
DESCRIPTION:Paper\n\nCharles Moulinec (Science and Technology Facilities C
 ouncil); Guillaume Houzeaux, Ricard Borrell, Adria Quintanas, and Guillerm
 o Oyarzun (Barcelona Supercomputing Center); Judicael Grasset (CNRS); and 
 Guido Giuntoli and Mariano Vazquez (Barcelona Supercomputing Center)\n\nTh
 e advent of hybrid CPU and accelerator supercomputers opens the door to ex
 tremely large multi-scale simulations. An example of such a multi-scale te
 chnique, the FE2 approach, has been designed to simulate material deformat
 ions, by getting a better estimation of the material properties, which, in
  effect, reduces the need to introduce physical modelling at macro-scale l
 evel, such as constitutive laws. Both macro- and micro-scale
 s are solved using the Finite Element method, the micro-scale being resolv
 ed at the Gauss points of the macro-scale mesh. As the micro-scale simulat
 ions do not require any information from each other, and are thus run conc
 urrently, the stated problem is embarrassingly parallel. The FE2 method th
 erefore directly benefits from hybrid machines, the macro-scale being sol
 ved on CPU whereas the micro-scale is offloaded to accelerators. The case 
 of a flat plate made of different materials is used to illustrate the pot
 ential of the method. In order to ensure good load balance on distributed 
 memory machines, weighting based on the type of materials the plate is mad
 e of is applied by means of a Space Filling Curve technique. Simulations h
 ave been carried out for over 5 trillion degrees of freedom on up to 2
 ,048 nodes (49,152 CPUs and 12,288 GPUs) of the US DOE Oak Ridge National 
 Laboratory high-end machine, Summit, showing an excellent speed-up for the
  assembly part of the framework, where the micro-scale is computed on GPU 
 using CUDA.\n\nDomain: Engineering, Computer Science, Machine Learning, an
 d Applied Mathematics\n\nSession Chair: Nur Aiman Fadel (ETH Zurich / CS
 CS)
END:VEVENT
END:VCALENDAR
