Automated Optimization of Deep Neural Networks: Dynamic Bit-Width and Layer-Width Selection via Cluster-Based Parzen Estimation

Published: 01 Jan 2024, Last Modified: 17 Feb 2025 · DATE 2024 · CC BY-SA 4.0
Abstract: Given the ever-growing complexity and computational requirements of deep learning models, efficiently optimizing neural network architectures has become imperative. This paper presents an automated, search-based method for optimizing the bit-width and layer-width of individual neural network layers, achieving substantial reductions in model size and processing requirements. Our approach employs Hessian-based search-space pruning to discard unpromising solutions, greatly reducing the search space. We further refine the optimization process with a novel adaptive algorithm that combines k-means clustering with tree-structured Parzen estimation (TPE), dynamically adjusting the figure of merit used in TPE, i.e., the desirability of a particular bit-width and layer-width configuration, and thereby expediting the identification of optimal configurations. Extensive experiments on benchmark datasets validate the efficacy of our method: it outperforms existing techniques, achieving an average 20% reduction in model size without sacrificing any output accuracy, and delivers a $12\times$ speedup in search time over the most advanced search-based approaches.
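The cluster-adapted TPE figure of merit described in the abstract might be sketched roughly as follows. This is a toy illustration only, not the authors' implementation: the synthetic loss surface, the per-cluster quantile rule, and all function names (`kmeans`, `parzen_density`, `tpe_score`) are assumptions. The idea shown is that observed (bit-width, layer-width) configurations are grouped by k-means, each cluster's good/bad split quantile is adapted to its mean loss, and a candidate is scored by the Parzen-density ratio l(x)/g(x).

```python
import numpy as np

rng = np.random.default_rng(42)

def kmeans(X, k=3, iters=25):
    """Minimal Lloyd's k-means (illustrative, not production-grade)."""
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def parzen_density(points, x, bw=1.0):
    """Gaussian Parzen-window density estimate at x (small epsilon avoids 0/0)."""
    if len(points) == 0:
        return 1e-12
    diff = (points - x) / bw
    return np.mean(np.exp(-0.5 * (diff ** 2).sum(axis=1))) + 1e-12

def tpe_score(history_x, history_y, candidate, labels, base_gamma=0.25):
    """Figure of merit l(x)/g(x) with a cluster-adapted good/bad split.

    Clusters with a lower mean loss contribute a larger fraction of their
    points to the "good" set, biasing the search toward promising regions.
    (Hypothetical adaptation rule, assumed for illustration.)
    """
    good, bad = [], []
    for j in np.unique(labels):
        idx = np.where(labels == j)[0]
        ys = history_y[idx]
        # Better-than-average clusters get a larger split quantile gamma.
        gamma = float(np.clip(base_gamma * history_y.mean() / max(ys.mean(), 1e-9),
                              0.1, 0.5))
        cut = np.quantile(ys, gamma)
        good.extend(idx[ys <= cut])
        bad.extend(idx[ys > cut])
    l = parzen_density(history_x[good], candidate)   # density of good configs
    g = parzen_density(history_x[bad], candidate)    # density of bad configs
    return l / g

# Toy history: (bit-width, layer-width) configs with a synthetic loss
# whose minimum lies near (4 bits, 64 units).
X = np.column_stack([rng.integers(2, 9, 60),
                     rng.integers(16, 129, 60)]).astype(float)
y = (X[:, 0] - 4) ** 2 + 0.01 * (X[:, 1] - 64) ** 2 + rng.normal(0, 0.1, 60)

# Standardize features so one kernel bandwidth works for both dimensions.
mu, sd = X.mean(axis=0), X.std(axis=0)
Xs = (X - mu) / sd
z = lambda c: (c - mu) / sd

labels = kmeans(Xs, k=3)
cand_good = np.array([4.0, 64.0])    # near the toy optimum
cand_bad = np.array([8.0, 128.0])    # large, costly configuration
print(tpe_score(Xs, y, z(cand_good), labels) > tpe_score(Xs, y, z(cand_bad), labels))
```

In this sketch the adaptive step is the per-cluster `gamma`; standard TPE uses a single global split quantile, whereas here clusters of promising configurations are allowed to dominate the "good" density.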