Realizing Joint Extreme-Scale Simulations on Multiple Supercomputers—Two Superfacility Case Studies

Theresa Pollinger, Alexander Van Craen, Philipp Offenhäuser, Dirk Pflüger

Published: 17 Nov 2024, Last Modified: 09 Nov 2025. License: CC BY-SA 4.0
Abstract: High-dimensional grid-based simulations are both a tool and a challenge for research in many domains. Their main challenge is the well-known curse of dimensionality, which is amplified by the need for fine resolutions in high-fidelity applications. The combination technique (CT) provides a straightforward way of performing such simulations while alleviating the curse of dimensionality. Recent work demonstrated the potential of the CT to join multiple systems simultaneously to perform a single high-dimensional simulation. This paper shows how to extend this to three or more systems and addresses some remaining challenges: load balancing on heterogeneous hardware; utilizing compression to maximize the communication bandwidth; efficient I/O management through hardware mapping; and improving memory utilization through algorithmic optimizations. Combining these contributions, we demonstrate the feasibility of the CT for extreme-scale Superfacility scenarios of 46 trillion DOF on two systems and 35 trillion DOF on three systems. Scenarios at these resolutions would be intractable with full-grid solvers ($>1{,}000$ nonillion DOF each).
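To illustrate why the CT alleviates the curse of dimensionality, the following minimal sketch enumerates the component grids of the classical combination technique and compares their summed point count with that of a full grid. It is an assumption-laden illustration, not the scheme used in the paper: the function names (`combination_scheme`, `grid_points`), the level convention (levels start at 1, with $2^{l}+1$ points per dimension), and the untruncated classical scheme are all chosen here for simplicity, and points shared between component grids are counted once per grid.

```python
from itertools import product
from math import comb, prod


def combination_scheme(dim, level):
    """Classical combination technique (illustrative convention, levels >= 1).

    Returns a dict mapping each component-grid level vector to its
    combination coefficient (-1)^q * C(dim-1, q), where the level vector
    satisfies |l|_1 = level + (dim - 1) - q for q = 0, ..., dim-1.
    """
    scheme = {}
    for q in range(dim):
        coeff = (-1) ** q * comb(dim - 1, q)
        target = level + (dim - 1) - q
        # Brute-force enumeration; fine for small dim and level.
        for lv in product(range(1, level + 1), repeat=dim):
            if sum(lv) == target:
                scheme[lv] = coeff
    return scheme


def grid_points(level_vector):
    """Points of an anisotropic full grid with 2^l + 1 points per dimension."""
    return prod(2 ** l + 1 for l in level_vector)


def ct_points(dim, level):
    """Sum of points over all component grids (shared points counted per grid)."""
    return sum(grid_points(lv) for lv in combination_scheme(dim, level))


def full_grid_points(dim, level):
    """Points of the isotropic full grid at the same maximum level."""
    return (2 ** level + 1) ** dim


if __name__ == "__main__":
    dim, level = 4, 8
    print("component grids:", len(combination_scheme(dim, level)))
    print("CT points:      ", ct_points(dim, level))
    print("full-grid points:", full_grid_points(dim, level))
```

The combination coefficients always sum to one, so constant functions are reproduced exactly. Each component grid is small and can be solved independently, which is the property that makes it possible to distribute the grids across several systems.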