AUBER: Automated BERT Regularization

28 Sept 2020 (modified: 22 Oct 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: BERT Regularization, Reinforcement Learning, Automated Regularization
Abstract: How can we effectively regularize BERT? Although BERT has proven effective in various downstream natural language processing tasks, it often overfits when only a small number of training instances are available. A promising direction for regularizing BERT is to prune its attention heads using a proxy score for head importance. However, heuristic-based methods are usually suboptimal since they predetermine the order in which attention heads are pruned. To overcome this limitation, we propose AUBER, an effective regularization method that leverages reinforcement learning to automatically prune attention heads from BERT. Instead of depending on heuristics or rule-based policies, AUBER learns a pruning policy that determines which attention heads should or should not be pruned for regularization. Experimental results show that AUBER outperforms existing pruning methods, achieving up to 10% higher accuracy. In addition, our ablation study empirically demonstrates the effectiveness of our design choices for AUBER.
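For readers who want a concrete picture of the pruning mechanism the abstract refers to, here is a minimal sketch (not the authors' code) using the Hugging Face `transformers` API. AUBER's contribution is the reinforcement-learning policy that decides *which* heads to prune; the `select_heads_to_prune` function below is a hypothetical placeholder for that learned policy, not a reproduction of it.

```python
# Minimal sketch of attention-head pruning in BERT, assuming the
# Hugging Face `transformers` library. The RL policy described in the
# paper is replaced by a hard-coded placeholder for illustration.
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

def select_heads_to_prune(model):
    """Hypothetical stand-in for AUBER's learned pruning policy.

    For illustration only: selects the last two heads of layer 0.
    AUBER would instead learn which heads to prune per layer.
    Returns a dict mapping layer index -> list of head indices.
    """
    num_heads = model.config.num_attention_heads
    return {0: [num_heads - 2, num_heads - 1]}

# `prune_heads` physically removes the selected heads from the model.
model.prune_heads(select_heads_to_prune(model))

# After pruning, the smaller model is fine-tuned on the (small)
# downstream dataset as usual; the removed heads act as a
# structural regularizer.
```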
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We propose a reinforcement-learning-based method that automatically regularizes BERT to improve its accuracy.
Supplementary Material: zip
Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/arxiv:2009.14409/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=zy5n8FMJg
