Hercules: Scalable and Network Portable In-Memory Ad-Hoc File System for Data-Centric and High-Performance Applications

Published: 01 Jan 2023, Last Modified: 08 May 2024Euro-Par 2023EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: The growing demands for data processing by new data-intensive applications are putting pressure on the performance and capacity of HPC storage systems. The advancement in storage technologies, such as NVMe and persistent memory, are aimed at meeting these demands. However, relying solely on ultra-fast storage devices is not cost-effective, leading to the need for multi-tier storage hierarchies to move data based on its usage. To address this issue, ad-hoc file systems have been proposed as a solution. They utilise the available storage of compute nodes, such as memory and persistent storage, to create a temporary file system that adapts to the application behaviour in the HPC environment. This work presents the design, implementation, and evaluation of a distributed ad-hoc in-memory storage system (Hercules), highlighting the new communication model included in Hercules. This communication model takes advantage of the Unified Communication X framework (UCX). This solution leverages the capabilities of RDMA protocols, including Infiniband, Onmipath, shared memory, and zero-copy transfers. The preliminary evaluation results show excellent network utilisation compared with other existing technologies.
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview