AdaPQ: Adaptive Exploration Product Quantization with Adversary-Aware Block Size Selection Toward Compression Efficiency

Yan-Ting Ye; Ting-An Chen; Ming-Syan Chen

AdaPQ: Adaptive Exploration Product Quantization with Adversary-Aware Block Size Selection Toward Compression Efficiency

Yan-Ting Ye, Ting-An Chen, Ming-Syan Chen

Published: 01 Jan 2024, Last Modified: 26 Jan 2025PAKDD (2) 2024EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: Product Quantization (PQ) has received an increasing research attention due to the effectiveness on bit-width compression for memory efficiency. PQ is developed to divide weight values into blocks and adopt clustering algorithms dividing them into groups assigned with quantized values accordingly. Existing research mainly focused on the clustering strategy design with a minimal error between the original weights and the quantized values for the performance maintenance. However, the block division, i.e., the selection of block size, determines the choices of number of clusters and compression rate which has not been fully studied. Therefore, this paper proposes a novel scheme AdaPQ with the process, Adaptive Exploration Product Quantization, to first flexibly construct varying block sizes by padding the filter weights, which enlarges the search space of quantization result of PQ and avoids being suffered from a sub-optimal solution. Afterward, we further design a strategy, Adversary-aware Block Size Selection, to select an appropriate block size for each layer by evaluating the sensitivity on performance under perturbation for obtaining a minor performance loss under a high compression rate. Experimental results show that AdaPQ achieves a higher accuracy under a similar compression rate compared with the state-of-the-art.

Loading