ATConf: auto-tuning high dimensional configuration parameters for big data processing frameworks

Published: 01 Jan 2023, Last Modified: 07 Aug 2024, Venue: Clust. Comput. 2023, License: CC BY-SA 4.0
Abstract: To support various application scenarios, big data processing frameworks (BDPFs) such as Spark usually provide users with a large number of performance-critical configuration parameters. Since manual configuration is both labor-intensive and time-consuming, automatically tuning the configuration parameters of BDPFs to achieve better performance has become an urgent need. To simultaneously address the corresponding challenges, such as the high-dimensional configuration space, we propose ATConf, a new black-box approach to automatically tuning the internal and external configuration parameters of BDPFs. Experimental results on our local distributed Spark cluster show that the best execution time achieved by ATConf is as much as 46.52% lower than that of the default configuration. Moreover, compared with the four baselines, ATConf further reduces the execution time relative to the default by at least 4.10% under the same constraint on the number of observations.
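To make the black-box setting and the reported metric concrete, the following is a minimal sketch and not ATConf itself: the abstract does not describe the algorithm, so plain random search is swapped in as a stand-in tuner. The three parameter names are real Spark settings, but their ranges, the default values used here, and the `synthetic_execution_time` function (a made-up replacement for actually running a job on a cluster) are all assumptions chosen for illustration. The sketch shows a tuner that only observes execution times under a fixed observation budget, and how a relative reduction over the default configuration could be computed.

```python
import random

# Hypothetical search space: real Spark parameter names, illustrative ranges.
SEARCH_SPACE = {
    "spark.executor.cores": (1, 8),
    "spark.executor.memory": (1, 16),          # in GiB, for illustration only
    "spark.sql.shuffle.partitions": (8, 400),
}

# Hypothetical baseline configuration used as the "default" in this sketch.
DEFAULT_CONFIG = {
    "spark.executor.cores": 1,
    "spark.executor.memory": 1,
    "spark.sql.shuffle.partitions": 200,
}


def synthetic_execution_time(config):
    """Made-up stand-in for running a job; returns a fake time in seconds."""
    cores = config["spark.executor.cores"]
    memory = config["spark.executor.memory"]
    partitions = config["spark.sql.shuffle.partitions"]
    return 100.0 / cores + 50.0 / memory + abs(partitions - 120) * 0.1 + 20.0


def relative_reduction(default_time, tuned_time):
    """Percentage by which tuned_time is lower than default_time."""
    return 100.0 * (default_time - tuned_time) / default_time


def random_search(objective, space, budget, seed=0):
    """Black-box tuning: sample `budget` configurations and keep the fastest.

    The tuner sees only the observed execution time of each configuration,
    never the internals of the objective.
    """
    rng = random.Random(seed)
    best_config, best_time = None, float("inf")
    for _ in range(budget):
        candidate = {
            name: rng.randint(low, high) for name, (low, high) in space.items()
        }
        elapsed = objective(candidate)
        if elapsed < best_time:
            best_config, best_time = candidate, elapsed
    return best_config, best_time


if __name__ == "__main__":
    default_time = synthetic_execution_time(DEFAULT_CONFIG)
    config, tuned_time = random_search(
        synthetic_execution_time, SEARCH_SPACE, budget=30
    )
    print("best configuration found:", config)
    print("relative reduction over default (%):",
          round(relative_reduction(default_time, tuned_time), 2))
```

Under this definition, a tuned run of 53.48 s against a default run of 100 s would correspond to a 46.52% reduction. Comparing tuners "under the same constraint" then amounts to giving each of them the same `budget`.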