Keywords: LLM quantization
Abstract: Large language models (LLMs) demonstrate remarkable performance but face substantial computational and memory costs that limit their practical deployment. Quantization has emerged as a promising solution; however, its effectiveness is often limited by quantization errors arising from weight distributions that are not quantization-friendly and from the presence of activation outliers.
To address these challenges, we introduce DBellQuant, a post-training quantization (PTQ) framework that achieves nearly 1-bit weight compression and 6-bit activation quantization with minimal performance degradation. DBellQuant applies a learnable transformation that maps the single-bell weight distribution to a dual-bell distribution, reducing binarization error, and smooths activation outliers through the inverse transformation. DBellQuant sets a new state of the art, preserving model performance under aggressive joint weight and activation quantization. For example, on the WikiText2 dataset, DBellQuant achieves a perplexity of 14.39 on LLaMA2-13B with nearly 1-bit weights and 6-bit activations, significantly outperforming the 21.35 of BiLLM, which does not quantize activations; this underscores its potential for compressing LLMs in real-world edge applications.
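A minimal numerical sketch of the two ideas the abstract describes: (1) a dual-bell weight distribution incurs far lower sign-binarization error than a single bell, and (2) any invertible transform applied to the weights can be exactly compensated by its inverse on the activations, which is where the activation smoothing comes from. The diagonal scale and all names below are illustrative assumptions, not the paper's actual learnable transform.

```python
import numpy as np

rng = np.random.default_rng(0)

def binarize_err(w):
    """Relative error of sign binarization w ~= alpha * sign(w), alpha = mean|w|."""
    alpha = np.abs(w).mean()
    return np.linalg.norm(w - alpha * np.sign(w)) / np.linalg.norm(w)

# (1) Single-bell vs. dual-bell binarization error.
single_bell = rng.normal(0.0, 1.0, 100_000)                      # mass near 0
dual_bell = rng.choice([-1.0, 1.0], 100_000) + rng.normal(0.0, 0.2, 100_000)
print(f"single-bell error: {binarize_err(single_bell):.3f}")     # ~0.60
print(f"dual-bell error:   {binarize_err(dual_bell):.3f}")       # ~0.20

# (2) Inverse compensation: Y = X W = (X T^{-1}) (T W) for any invertible T,
# shown here with a hypothetical per-channel diagonal scale.
d_in, d_out, n = 64, 32, 16
X = rng.normal(size=(n, d_in))
X[:, 0] *= 20.0                           # inject an outlier activation channel
W = rng.normal(scale=0.02, size=(d_in, d_out))
s = np.sqrt(np.abs(X).max(axis=0))        # illustrative scale, not learned here
X_smooth, W_trans = X / s, s[:, None] * W
assert np.allclose(X @ W, X_smooth @ W_trans)        # layer output is unchanged
print(f"outlier channel range: {np.abs(X[:, 0]).max():.1f} -> "
      f"{np.abs(X_smooth[:, 0]).max():.1f}")         # activations smoothed
```

DBellQuant learns the transform jointly, so the weight side lands in the binarization-friendly dual-bell regime of (1) while the inverse on the activation side simultaneously suppresses outliers as in (2).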
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 11987