Attention-Informed Surrogates for Navigating Power-Performance Trade-offs in HPC

NeurIPS 2025 Workshop MLForSys Submission 63 Authors

Published: 30 Oct 2025, Last Modified: 14 Nov 2025, MLForSys 2025, CC BY 4.0
Keywords: High-Performance Computing, Scheduling, Multi-Objective Optimization, Bayesian Optimization, Surrogate Modeling, Attention, Embeddings, Performance Modeling, Runtime–Power Trade-offs
TL;DR: We present a surrogate-assisted multi-objective Bayesian optimization framework that leverages Attention-Informed Surrogates to model runtime–power trade-offs in HPC scheduling.
Abstract: High-Performance Computing (HPC) schedulers must balance user performance with facility-wide resource constraints. The task boils down to selecting the optimal number of nodes for a given job. We present a surrogate-assisted multi-objective Bayesian optimization (MOBO) framework to automate this complex decision. Our core hypothesis is that surrogate models informed by attention-based embeddings of job telemetry can capture performance dynamics more effectively than standard regression techniques. We pair this with an intelligent sample acquisition strategy to ensure the approach is data-efficient. On two production HPC datasets, our embedding-informed method consistently identified higher-quality Pareto fronts of runtime-power trade-offs compared to baselines. Furthermore, our intelligent data sampling strategy drastically reduced training costs while improving the stability of the results. To our knowledge, this is the first work to successfully apply embedding-informed surrogates in a MOBO framework to the HPC scheduling problem, jointly optimizing for performance and power on production workloads.
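To make the notion of a runtime-power Pareto front concrete, here is a minimal sketch in Python. It is not the authors' method: the candidate tuples, their numbers, and the helper name `pareto_front` are hypothetical and chosen only for illustration. A node count is kept if no other node count is at least as good on both objectives and strictly better on one.

```python
from typing import List, Tuple

# Hypothetical candidate: (node_count, runtime_seconds, power_kw).
# All values below are made up for illustration, not taken from the paper.
Candidate = Tuple[int, float, float]


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Return the non-dominated candidates when minimizing runtime and power."""
    front = []
    for c in cands:
        # c is dominated if some other candidate is no worse on both
        # objectives and strictly better on at least one.
        dominated = any(
            o[1] <= c[1] and o[2] <= c[2] and (o[1] < c[1] or o[2] < c[2])
            for o in cands
        )
        if not dominated:
            front.append(c)
    # Sort by runtime so the trade-off curve reads from fastest to slowest.
    return sorted(front, key=lambda c: c[1])


candidates = [
    (1, 400.0, 1.0),
    (2, 220.0, 1.9),
    (4, 130.0, 3.6),
    (8, 140.0, 7.0),   # slower and more power-hungry than 4 nodes
    (16, 90.0, 13.5),
]

front = pareto_front(candidates)
for nodes, runtime, power in front:
    print(f"nodes={nodes:2d} runtime={runtime:6.1f}s power={power:5.1f}kW")
```

In a surrogate-assisted setting, the runtime and power values would come from model predictions rather than a fixed table; the dominance check itself stays the same.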
Submission Number: 63