Data-Centric Python: Bridging Productivity and Performance via Data Movement Minimization
Description: Computational scientists are migrating towards high-productivity languages for rapid prototyping and reproducible experiment sharing. Specifically, Python is becoming the language of choice for several fields, partly driven by the attention from the Machine Learning community. However, productivity often clashes with performance, as the (usually ML-specific) frameworks are not geared towards the needs of large-scale scientific computing. In this talk, we will characterize the performance/productivity gap in Python and discuss how to address it, all while retaining the high-level semantics of the language and leveraging embedded DSLs such as NumPy. The talk will review the barriers that inhibit optimizing Python code, define a subset of the language that can be compiled, and discuss how to deal with the remainder of the code. We will then show how the compilable subset, called Data-Centric (DaCe) Python, can be subject to both local and global optimization via data movement minimization. As a case study, we will review the FV3 climate model and a recent port of its full dynamical core to GPUs using a combination of the GT4Py DSL and DaCe Python.
Time: Monday, June 26, 16:30 - 17:00 CEST
Computer Science, Machine Learning, and Applied Mathematics