AI4HPC: Library to Train AI Models on HPC Systems using CFD Datasets

Published: 28 Oct 2023, Last Modified: 14 Nov 2023, Venue: WANT@NeurIPS 2023 Poster
Keywords: artificial intelligence, deep neural network, distributed training, computational fluid dynamics, high-performance computing
TL;DR: An open-source library to train AI models on HPC systems for data-driven use cases with enormous datasets.
Abstract: This paper introduces AI4HPC, an open-source library designed to integrate Artificial Intelligence (AI) models and workflows into High-Performance Computing (HPC) systems for Computational Fluid Dynamics (CFD)-based applications. Developed by CoE RAISE, AI4HPC not only addresses challenges in handling intricate CFD datasets, model complexity, and scalability but also includes extensive code optimizations to improve performance. The library covers data manipulation, specialized ML architectures, distributed training, hyperparameter optimization, and performance monitoring. By integrating AI and CFD, AI4HPC enables efficient analysis of large-scale datasets. This paper outlines the architecture, components, and potential of AI4HPC to accelerate and augment data-driven fluid dynamics simulations and beyond, and demonstrates the library's scalability with results on up to 3,664 GPUs.
Submission Number: 10