InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Boxin Wang; Shuohang Wang; Yu Cheng; Zhe Gan; Ruoxi Jia; Bo Li; Jingjing Liu

Published: 12 Jan 2021, Last Modified: 22 Jun 2025, ICLR 2021 Poster, Readers: Everyone

Keywords: adversarial robustness, information theory, BERT, adversarial training, NLI, QA

Abstract: Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks. Recent studies, however, show that such BERT-based models are vulnerable to textual adversarial attacks. We aim to address this problem from an information-theoretic perspective, and propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models. InfoBERT contains two mutual-information-based regularizers for model training: (i) an Information Bottleneck regularizer, which suppresses noisy mutual information between the input and the feature representation; and (ii) a Robust Feature regularizer, which increases the mutual information between local robust features and global features. We provide a principled way to theoretically analyze and improve the robustness of representation learning for language models in both standard and adversarial training. Extensive experiments demonstrate that InfoBERT achieves state-of-the-art robust accuracy over several adversarial datasets on Natural Language Inference (NLI) and Question Answering (QA) tasks. Our code is available at https://github.com/AI-secure/InfoBERT.
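
The two regularizers can be read as terms in a single training objective. The following is a schematic sketch only, not necessarily the paper's exact formulation: the symbols $X$ (input), $Y$ (label), $T$ (feature representation), $T_{k_j}$ (the $j$-th local robust feature), $Z$ (global feature), and the weights $\alpha, \beta \ge 0$ are notation assumed here for illustration, since the abstract does not define them.

```latex
\max_{\theta} \;
  \underbrace{I(Y; T)}_{\text{task-relevant information}}
  \;-\; \beta \, \underbrace{I(X; T)}_{\text{Information Bottleneck regularizer}}
  \;+\; \alpha \, \underbrace{\sum_{j} I(T_{k_j}; Z)}_{\text{Robust Feature regularizer}}
```

Under this reading, a larger $\beta$ suppresses more of the information flowing from the input to the representation, while a larger $\alpha$ ties the local robust features more strongly to the global features. How the mutual-information terms are estimated in practice is not stated in the abstract.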

One-sentence Summary: We propose a novel learning framework, InfoBERT, for robust fine-tuning of pre-trained language models from an information-theoretic perspective, and achieve state-of-the-art robust accuracy over several adversarial datasets on NLI and QA tasks.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Code: [AI-secure/InfoBERT](https://github.com/AI-secure/InfoBERT) + [1 community implementation](https://paperswithcode.com/paper/?openreview=hpH98mK5Puk)

Data: [ANLI](https://paperswithcode.com/dataset/anli), [MultiNLI](https://paperswithcode.com/dataset/multinli), [SNLI](https://paperswithcode.com/dataset/snli)

Community Implementations: [4 code implementations](https://www.catalyzex.com/paper/infobert-improving-robustness-of-language/code)
