Wave phenomena arising from conservation laws are omnipresent in the computational
sciences, and codes simulating them typically demand enormous compute power.
However, few mathematicians
and modellers have code at hand that allows them to evaluate their ideas straightforwardly on
peta- and exascale machines, many wave equation solvers do not fit heterogeneous
(GPU) hardware, and many wave simulations require exascale capabilities only from time
to time rather than 24/7.
The community thus runs the risk of falling into a sophistication gap,
where the software that scales does not incorporate the latest numerical and
algorithmic research, while the latest models and numerics are not scaled up.
It runs the risk of falling into a heterogeneity gap, where the particular
hardware configuration that drives exascale is not appropriately supported by
the software.
It runs the risk of falling into an economic disproportionality gap, where compute centres struggle
to make the case for granting a project full machine access because its code base cannot exploit the machine efficiently.
We propose to extend the FETHPC H2020 code ExaHyPE into a software package called ExaClaw,
which tackles these risks.
ExaClaw will couple ClawPack, the leading grid-based toolbox for modelling wave phenomena,
to the scaling, high-performance ADER-DG AMR engine ExaHyPE,
and it will be able to deploy compute-intensive calculations to GPUs.
In addition, the team behind ExaClaw will prototype a new supercomputer usage scheme
well-suited to accommodate bursts of extreme compute demand.
These activities pair up with community building and the release of three ExaClaw demonstrators.
This makes ExaClaw a high-profile ExCALIBUR use case.
ExaHyPE is an engine to write solvers for grid-based, first-order
hyperbolic PDEs.
It supports block-structured Finite Volume schemes and ADER-DG, and it realises a clear
separation of concerns to support any application domain.
Users feed application domain knowledge such as
flux functions, eigenvalues, initial values or refinement criteria into the engine.
The engine then runs and orchestrates the actual computation.
Mesh traversal, refinement, parallelisation, load balancing, limiting, and so forth
are all hidden from the user.
Internally, the code employs a small set of premanufactured Riemann solvers.
They can be replaced by custom user implementations.
To widen the engine's applicability and productivity, ExaClaw will supplement
ExaHyPE's Riemann solvers with solvers from the ClawPack suite.
ClawPack is the biggest open-source repository for explicit wave equation system
solvers, and it comprises a huge variety of well-studied, bespoke Riemann solvers for various
application domains.
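
The separation of concerns described above can be illustrated with a minimal sketch. It is not ExaHyPE's or ClawPack's actual interface: the names `make_rusanov`, `finite_volume_step`, `advection_flux`, `advection_speed` and `upwind` are hypothetical, the grid is a periodic 1D toy, and the Rusanov (local Lax-Friedrichs) flux merely stands in for a premanufactured Riemann solver. The user supplies a flux and a maximum eigenvalue; the engine part owns the update loop and accepts any replacement solver with the same signature.

```python
from typing import Callable, List

Flux = Callable[[float], float]
MaxEigenvalue = Callable[[float], float]
RiemannSolver = Callable[[float, float], float]


# --- Engine side (hypothetical): generic, knows nothing about the PDE ---
def make_rusanov(flux: Flux, max_eigenvalue: MaxEigenvalue) -> RiemannSolver:
    """Build a Rusanov (local Lax-Friedrichs) numerical flux from user callbacks."""
    def solve(q_left: float, q_right: float) -> float:
        speed = max(max_eigenvalue(q_left), max_eigenvalue(q_right))
        central = 0.5 * (flux(q_left) + flux(q_right))
        return central - 0.5 * speed * (q_right - q_left)
    return solve


def finite_volume_step(q: List[float], dx: float, dt: float,
                       riemann: RiemannSolver) -> List[float]:
    """One explicit first-order Finite Volume update on a periodic 1D grid."""
    n = len(q)
    # face_flux[i] is the numerical flux between cell i-1 and cell i;
    # q[-1] wraps around, which gives the periodic boundary.
    face_flux = [riemann(q[i - 1], q[i]) for i in range(n)]
    return [q[i] - dt / dx * (face_flux[(i + 1) % n] - face_flux[i])
            for i in range(n)]


# --- User side: linear advection q_t + (a q)_x = 0 with a = 1 ---
def advection_flux(q: float) -> float:
    return 1.0 * q


def advection_speed(q: float) -> float:
    return 1.0


def upwind(q_left: float, q_right: float) -> float:
    """Custom replacement solver, valid for positive advection speed only."""
    return 1.0 * q_left


if __name__ == "__main__":
    q = [0.0, 1.0, 0.0, 0.0]
    default_solver = make_rusanov(advection_flux, advection_speed)
    print(finite_volume_step(q, dx=1.0, dt=0.5, riemann=default_solver))
    print(finite_volume_step(q, dx=1.0, dt=0.5, riemann=upwind))
```

For this scalar advection problem the Rusanov flux reduces to the upwind flux, so both calls yield the same update; for systems of equations the two would generally differ.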
ExaHyPE realises a task decomposition where one particular task type dominates the runtime.
This type exhibits a high arithmetic intensity and
will be deployed to GPUs through various
technologies (OpenMP, OpenACC, oneAPI).
Rather than serving as passive workhorses that are fed by the host, ExaClaw's GPUs actively steal
their jobs from the compute nodes, i.e.~they are in charge of their own workload.
This establishes the notion of skeleton hardware, where GPUs or other accelerators
can be dynamically added to or removed from a supercomputer run, and the code inherently fits
different hardware configurations.
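
The idea that accelerators fetch their own work can be sketched as follows. This is a deliberately simplified, pull-based model with one shared queue, not a full work-stealing scheduler with per-worker deques, and it does not touch a real GPU: the function `run_with_stealing` and the worker names are hypothetical, and Python threads stand in for compute nodes and accelerators. The point is only that the producer of tasks does not need to know how many workers exist.

```python
import queue
import threading
from typing import Callable, Dict, List


def run_with_stealing(tasks: List[Callable[[], float]],
                      worker_names: List[str]) -> Dict[str, List[float]]:
    """Let every worker pull tasks from a shared pool until it is empty."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    # Each worker appends only to its own list, so no extra locking is needed.
    done: Dict[str, List[float]] = {name: [] for name in worker_names}

    def worker(name: str) -> None:
        while True:
            try:
                task = pending.get_nowait()  # the worker grabs work itself
            except queue.Empty:
                return  # nothing left: the worker retires
            done[name].append(task())

    threads = [threading.Thread(target=worker, args=(name,))
               for name in worker_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return done


if __name__ == "__main__":
    jobs = [lambda i=i: float(i * i) for i in range(20)]
    # The same task list runs unchanged with or without "accelerators".
    for names in (["cpu"], ["cpu", "gpu-0", "gpu-1"]):
        result = run_with_stealing(jobs, names)
        print({name: len(values) for name, values in result.items()})
```

Which worker ends up with which task depends on thread scheduling, but the set of completed tasks is the same for any worker configuration.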
Finally, ExaClaw will investigate a novel HPC usage scheme where the load balancing
minimises the number of machine nodes in use.
If, however, the workload of a run becomes massive (e.g.~due to adaptive mesh refinement),
ExaClaw will be able to book further resources dynamically.
The project abandons a static hardware-to-run association and allows multiple
simulations to negotiate with each other over which one should receive the largest share of resources.
Simulations thus can have (close to) full machine access when they need it, but release
resources whenever their demand decreases again.
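
One conceivable form of such a dynamic allocation is sketched below. The text does not specify the actual policy, so everything here is an assumption made for illustration: the function `allocate_nodes`, the notion of a fixed `node_capacity`, and the proportional sharing rule used when the machine is oversubscribed. Each run asks for the smallest node count that holds its current workload; if the requests exceed the machine, every run keeps one node and the remainder is shared in proportion to the outstanding demand.

```python
import math
from typing import Dict


def allocate_nodes(workloads: Dict[str, float], node_capacity: float,
                   total_nodes: int) -> Dict[str, int]:
    """Assign nodes to runs; hypothetical policy, see the assumptions above."""
    if total_nodes < len(workloads):
        raise ValueError("need at least one node per simulation")

    # Smallest node count that accommodates each run's current workload.
    wanted = {name: max(1, math.ceil(load / node_capacity))
              for name, load in workloads.items()}
    if sum(wanted.values()) <= total_nodes:
        return wanted  # everyone gets the minimum they need, the rest stays free

    # Oversubscribed: one node each, remaining nodes shared proportionally.
    spare = total_nodes - len(wanted)
    extra_demand = sum(count - 1 for count in wanted.values())
    return {name: 1 + (count - 1) * spare // extra_demand
            for name, count in wanted.items()}


if __name__ == "__main__":
    # Two runs on an 8-node machine, each node holding 10 workload units.
    print(allocate_nodes({"a": 10.0, "b": 30.0}, 10.0, 8))
    # Run "a" refines its mesh and its workload grows tenfold.
    print(allocate_nodes({"a": 100.0, "b": 30.0}, 10.0, 8))
```

Calling the function again whenever a workload changes models runs that grow towards full machine access and later release nodes; integer rounding can leave a node unassigned in the oversubscribed case.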