Harnessing federated learning for anomaly detection in supercomputer nodes

Published: 01 Jan 2024, Last Modified: 27 Jul 2025Future Gener. Comput. Syst. 2024EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Highlights•High-performance computing (HPC) systems have become crucial in modern society due to their extraordinary computational capacities.•Nonetheless, they are expensive, and their anomalous behavior causes inconvenience to their users.•Optimizing and improving anomaly detection in HPC is a vital area of research.•Federated Learning has improved the average anomaly detection f-score from 0.46 to 0.87.•Federated Learning has also reduced the average duration of training data required to train anomaly detection HPC models from 5 months to 1.125 weeks.
Loading