Presentation – PASC 2023

Data Centric Computing & the Computing Continuum, the IO-SEA Project Proposal

Presenter

Description

A growing number of High-Performance Computing workflows rely on data collected in the field, and some of them require careful design in terms of data management. For instance, tsunami risk prediction processes data collected from seismic sensors spread around the world. When seismic waves are detected, the workflow must run as fast as possible to evaluate the tsunami risk and possibly raise an alert. In this talk, we present how the concepts and tools developed in the European-funded IO-SEA project can be used to implement such “distributed data centric” workflows. We introduce the concepts of datasets and namespaces, which group data into sets that can be manipulated as a whole (moved, copied, archived…). Datasets are made accessible to computing resources through ephemeral I/O services running on dedicated "data nodes" optimized for handling large quantities of data. Users specify which datasets each workflow step requires, and the runtime environment sets up the ephemeral I/O services accordingly. Users can also control data movement within the storage hierarchy to optimize time-to-solution, keeping frequently accessed data in the fastest storage tiers.

Slides

PDF

Time

Monday, June 26, 14:30 - 15:00 CEST

Location

Seehorn

Session

MS1E - Data Management across the Computing Continuum

Session Chair

François Tessier

INRIA

Event Type

Minisymposium

Domains

Author

Philippe Couvée

Atos