LEARNING BILATERAL CLIPPING PARAMETRIC ACTIVATION FUNCTION FOR LOW-BIT NEURAL NETWORKS

28 Sept 2020 (modified: 05 May 2023) | ICLR 2021 Conference Withdrawn Submission | Readers: Everyone
Keywords: quantization, activation function, unbounded, full-precision
Abstract: The Rectified Linear Unit (ReLU) is a widely used activation function in deep neural networks, and several works have been devoted to designing variants that improve performance. However, the outputs of most such functions are unbounded, which causes severe accuracy degradation when a full-precision model is quantized. To tackle this unboundedness, Choi et al. (2019) introduce an activation clipping parameter for the standard ReLU. In this paper, we propose a Bilateral Clipping Parametric Rectified Linear Unit (BCPReLU) as a generalization of ReLU and several of its variants. Specifically, BCPReLU introduces trainable slope and truncation parameters for both positive and negative inputs. We theoretically prove that BCPReLU has almost the same expressive power as its unbounded counterpart, and establish its convergence in low-bit quantization training. Numerical experiments on a range of popular models and datasets demonstrate its effectiveness, outperforming state-of-the-art methods.
One-sentence Summary: We propose a new clipping parametric activation function, a generalization of ReLU and several of its variants, to alleviate the unboundedness problem in low-bit quantization training.
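For illustration, below is a minimal PyTorch sketch of a bilaterally clipped parametric activation in the spirit described by the abstract: a trainable slope and truncation level on each side, so the output stays bounded for quantization. The parameter names (k_pos, alpha, k_neg, beta) and the exact parameterization are assumptions made here for illustration and may differ from the formulation in the paper.

```python
import torch
import torch.nn as nn


class BCPReLU(nn.Module):
    """Sketch of a bilateral clipping parametric ReLU (hypothetical form).

    Positive branch: slope k_pos, clipped above at alpha.
    Negative branch: slope k_neg, clipped below at -beta.
    All four parameters are trainable, so the clipping range is learned.
    """

    def __init__(self, k_pos=1.0, alpha=6.0, k_neg=0.25, beta=1.0):
        super().__init__()
        self.k_pos = nn.Parameter(torch.tensor(float(k_pos)))
        self.alpha = nn.Parameter(torch.tensor(float(alpha)))
        self.k_neg = nn.Parameter(torch.tensor(float(k_neg)))
        self.beta = nn.Parameter(torch.tensor(float(beta)))

    def forward(self, x):
        # Positive side: linear with slope k_pos, truncated at alpha.
        pos = torch.minimum(self.k_pos * torch.relu(x), self.alpha)
        # Negative side: linear with slope k_neg, truncated at -beta.
        neg = torch.maximum(self.k_neg * (-torch.relu(-x)), -self.beta)
        return pos + neg


if __name__ == "__main__":
    act = BCPReLU()
    x = torch.linspace(-8.0, 8.0, 9)
    print(act(x))  # outputs lie in [-beta, alpha]
```

Because the output range is bounded by the learned truncation levels, a uniform low-bit quantizer can be applied to the activation without the large clipping error an unbounded ReLU would incur; this is the motivation the abstract gives, and the sketch only illustrates one plausible parameterization.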
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=uqD3zLa-0B