Improving Language Model Fine-tuning with Information Gain Filtration

Javier S. Turek; Richard Antonello; Nicole Marie Beckage; Alexander Huth

27 May 2022 (modified: 05 May 2023), LXNLP 2022, Minor revisions, Readers: Everyone

Keywords: Fine-tuning, Information Gain, Data filtering, Transformers

TL;DR: Information Gain Filtration is a novel method that selects informative training examples to improve model fine-tuning.

Abstract: Language model fine-tuning is essential for modern natural language processing. The effectiveness of fine-tuning is limited by the inclusion of training examples that negatively affect performance. Here we present Information Gain Filtration, a general fine-tuning method for improving the final performance of a fine-tuned model. We define the Information Gain of an example as the improvement on a validation metric after training on that example. A secondary learner is then trained to approximate this quantity. During fine-tuning, this learner filters out uninformative examples and keeps informative ones. We show that our method is robust and yields consistent improvements across datasets, fine-tuning tasks, and language model architectures.
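
The pipeline described in the abstract (measure Information Gain, train a secondary learner to approximate it, filter during fine-tuning) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a one-parameter linear regressor in place of a language model, mean squared error as the validation metric, a 3-nearest-neighbor regressor as the secondary learner, Information Gain measured at the initial weight only, and a keep-threshold of zero. All function names and constants are hypothetical.

```python
import random

def val_loss(w, val):
    # Validation metric: mean squared error of the model y = w * x.
    return sum((w * x - y) ** 2 for x, y in val) / len(val)

def sgd_step(w, example, lr=0.05):
    # One gradient step on the squared error of a single example.
    x, y = example
    return w - lr * 2 * (w * x - y) * x

def information_gain(w, example, val):
    # Improvement of the validation metric after training on `example`.
    return val_loss(w, val) - val_loss(sgd_step(w, example), val)

def make_data(n, noise_rate, rng):
    # True relation is y = 2x; a fraction of labels is flipped to y = -2x.
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        flipped = rng.random() < noise_rate
        data.append((x, -2 * x if flipped else 2 * x))
    return data

def fit_secondary_learner(examples, gains, k=3):
    # Assumed secondary learner: k-nearest-neighbor regression on (x, y).
    pairs = list(zip(examples, gains))
    def predict_gain(example):
        def sq_dist(pair):
            (px, py), _ = pair
            return (px - example[0]) ** 2 + (py - example[1]) ** 2
        nearest = sorted(pairs, key=sq_dist)[:k]
        return sum(gain for _, gain in nearest) / k
    return predict_gain

def fine_tune(w, stream, keep):
    # Train only on the examples that the filter accepts.
    for example in stream:
        if keep(example):
            w = sgd_step(w, example)
    return w

rng = random.Random(0)
val = make_data(50, 0.0, rng)      # clean validation set
probe = make_data(200, 0.3, rng)   # examples with measured Information Gain
train = make_data(500, 0.3, rng)   # noisy fine-tuning stream

w0 = 0.0
gains = [information_gain(w0, ex, val) for ex in probe]
predict_gain = fit_secondary_learner(probe, gains)

w_all = fine_tune(w0, train, keep=lambda ex: True)
w_igf = fine_tune(w0, train, keep=lambda ex: predict_gain(ex) > 0.0)

print("validation loss, no filtering:", val_loss(w_all, val))
print("validation loss, filtered:    ", val_loss(w_igf, val))
```

In this toy setting, flipped labels push the weight away from the true slope, so their measured Information Gain is negative and the secondary learner tends to reject similar examples during fine-tuning. The threshold and the choice of secondary learner are design knobs of the sketch, not values taken from the paper.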

Submission Type: Non-archival

Volunteer As A Reviewer: Yes
