Keywords: Fine-tuning, Information Gain, Data filtering, Transformers
TL;DR: Information Gain Filtration is a novel method that selects informative training examples to improve language model fine-tuning.
Abstract: Language model fine-tuning is essential for modern natural language processing. The effectiveness of fine-tuning is limited by the inclusion of training examples that negatively affect performance. Here we present Information Gain Filtration, a general fine-tuning method for improving the final performance of a fine-tuned model. We define the Information Gain of an example as the improvement on a validation metric after training on that example. A secondary learner is then trained to approximate this quantity. During fine-tuning, this learner separates informative examples from uninformative ones. We show that our method is robust and yields consistent improvements across datasets, fine-tuning tasks, and language model architectures.
Submission Type: Non-archival
Volunteer As A Reviewer: Yes
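
The three steps named in the abstract (measure the Information Gain of an example, fit a secondary learner to approximate it, filter during fine-tuning) can be illustrated with a minimal sketch. This is not the authors' implementation: a toy linear regression model trained with NumPy stands in for the language model, a least-squares regressor stands in for the secondary learner, and every function name, the `example_features` representation, the learning rate, and the threshold are assumptions made for illustration.

```python
import numpy as np


def val_loss(w, X_val, y_val):
    """Mean squared error of the linear model w on the validation set."""
    return float(np.mean((X_val @ w - y_val) ** 2))


def sgd_step(w, x, y, lr=0.05):
    """One gradient step on a single example under squared error."""
    grad = 2.0 * (x @ w - y) * x
    return w - lr * grad


def information_gain(w, x, y, X_val, y_val, lr=0.05):
    """Validation-loss reduction obtained by training once on (x, y).

    Positive values mean the example helped; negative values mean it hurt.
    """
    before = val_loss(w, X_val, y_val)
    after = val_loss(sgd_step(w, x, y, lr), X_val, y_val)
    return before - after


def example_features(x, y):
    """Hypothetical example representation: inputs followed by the label."""
    return np.append(x, y)


def fit_secondary_learner(feats, gains):
    """Least-squares linear regressor (with bias) from features to gain.

    Assumption: a linear regressor is used here only to keep the sketch
    small; it is a stand-in for whatever secondary learner is trained.
    """
    A = np.hstack([feats, np.ones((len(feats), 1))])
    coef, *_ = np.linalg.lstsq(A, gains, rcond=None)

    def predict(f):
        return np.hstack([f, np.ones((len(f), 1))]) @ coef

    return predict


def filtered_finetune(w, X, y, predict_gain, threshold=0.0, lr=0.05):
    """Train only on examples whose predicted gain exceeds the threshold."""
    feats = np.array([example_features(a, b) for a, b in zip(X, y)])
    scores = predict_gain(feats)
    kept = 0
    for x_i, y_i, s in zip(X, y, scores):
        if s > threshold:
            w = sgd_step(w, x_i, y_i, lr)
            kept += 1
    return w, kept
```

In this sketch one would compute `information_gain` for a small subset of examples, pass their features and gains to `fit_secondary_learner`, and then call `filtered_finetune` on the remaining data with the returned predictor. The filter is a hard threshold on the predicted gain; other selection rules could be substituted without changing the overall structure.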