Benchmarking AutoML Clustering Frameworks

Published: 30 Apr 2024, Last Modified: 26 Aug 2024, Venue: AutoML 2024 Workshop, License: CC BY 4.0
Keywords: AutoML, Clustering, Benchmarking, CVIs, Autocluster, ML2DAC, cSmartML, AutoML4Clust, Benchmark, Unsupervised
TL;DR: Benchmarking of AutoML frameworks for unsupervised clustering
Abstract: The surge of frameworks for automated unsupervised clustering has exposed a notable gap in performance assessment, unified datasets, and methodologies for this field. The lack of standardization and of proper clustering goal setting obscures the applicability and suitability of such solutions. We therefore propose a benchmark to bridge this gap, offering a comparative analysis of AutoML frameworks for clustering based on several criteria and a comprehensive set of benchmarking problems. Four prominent AutoML frameworks for unsupervised clustering (AutoML4Clust, Autocluster, cSmartML, and ML2DAC) were compared following our methodology. By extending the evaluation beyond quantitative metrics, this research contributes to a more nuanced understanding of the applicability and performance of AutoML for a diverse set of clustering problems. Our analysis shows an evident need for further work on pipeline synthesis (i.e., search and optimization of complete pipelines), clustering goal definition, and suitable analysis dimensions.
Submission Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Evaluation Metrics: Yes
Submission Number: 6