Cost-Effective Methodology for Complex Tuning Searches in HPC: Navigating Interdependencies and Dimensionality
Abstract: Tuning searches in High-Performance Computing (HPC) must not only fine-tune the parameters of application routines but also account for potential interdependencies among them, which makes traditional optimization methods inefficient. Rather than analyzing the interdependencies among parameters and routines, practitioners often face a dilemma: either run an independent tuning search for each routine, overlooking interdependence, or run a more resource-intensive joint search over all routines. This choice is driven by the computational cost of some interdependence analysis techniques in the literature. Our methodology adapts and refines these techniques to make them computationally feasible in real-world scenarios. It uses a cost-effective interdependence analysis to decide whether to merge several tuning searches into a joint search or to conduct orthogonal searches. When tested on synthetic functions with varying levels of parameter interdependence, our methodology explores the search space efficiently. Compared to Bayesian-optimization-based fully joint searches, it reduces the search time by up to 95%; compared to fully independent searches, it yields final configurations that are up to 8% more accurate. When applied to GPU-offloaded Real-Time Time-Dependent Density Functional Theory (RT-TDDFT), an application in computational materials science that challenges modern HPC autotuners, our methodology achieved an effective tuning search. Its adaptability and efficiency extend beyond RT-TDDFT, making it valuable for related HPC applications.
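The choice between joint and orthogonal searches can be illustrated with a minimal sketch. It is not the interdependence analysis used in this work: it substitutes a simple second-order mixed finite difference as a stand-in test for interaction between two parameters, and every name in it (`interaction_strength`, `group_parameters`, `synthetic`, the threshold and step values) is hypothetical. Parameters found to interact are merged into one group to be tuned jointly, while separate groups could be tuned independently.

```python
import itertools
import random


def interaction_strength(f, i, j, dim, samples=32, step=0.5, seed=0):
    """Mean absolute mixed finite difference of f in parameters i and j.

    Assumption: this stand-in measure is (numerically) zero whenever f is
    additively separable in x[i] and x[j]; it is not the paper's analysis.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

        def shifted(di, dj):
            y = list(x)
            y[i] += di
            y[j] += dj
            return f(y)

        total += abs(
            shifted(step, step) - shifted(step, 0.0)
            - shifted(0.0, step) + shifted(0.0, 0.0)
        )
    return total / samples


def group_parameters(f, dim, threshold=1e-6):
    """Merge interacting parameters with union-find; one group = one joint search."""
    parent = list(range(dim))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    for i, j in itertools.combinations(range(dim), 2):
        if interaction_strength(f, i, j, dim) > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for p in range(dim):
        groups.setdefault(find(p), []).append(p)
    return sorted(groups.values())


def synthetic(x):
    # Hypothetical test function: x[0] and x[1] interact, x[2] is independent.
    return (x[0] * x[1] - 1.0) ** 2 + (x[2] - 0.3) ** 2


groups = group_parameters(synthetic, dim=3)
print(groups)  # [[0, 1], [2]]
```

In this toy case, parameters 0 and 1 would be tuned in one joint search and parameter 2 in a separate one. The check costs four function evaluations per sample and per parameter pair, so in a real application the sample count would have to be weighed against the cost of a single run.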