Enhancing Fine-Tuning in Low Data Regime by Increasing Representation Entropy During Pre-Training Phase

Published: 01 Jan 2023 · Last Modified: 01 Oct 2024 · ICTC 2023 · CC BY-SA 4.0
Abstract: Training a new model by fine-tuning a pre-trained model has gained considerable prominence due to its potential to reduce training cost or improve performance. Achieving both objectives simultaneously, however, remains challenging. To address this challenge, this study adopts two entropy metrics that can be seamlessly integrated into the cross-entropy loss. Our approach increases representation entropy during the pre-training phase, so that more information is encoded in the representations of the pre-trained model; this additional information can then be exploited during the subsequent fine-tuning phase. In experiments, we verify that the two adopted entropy metrics reliably quantify representation entropy, and we demonstrate their effectiveness in improving fine-tuning performance, particularly in the low-data regime.
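The abstract does not name the two entropy metrics, but the overall recipe it describes can be illustrated with a minimal sketch: a differentiable proxy for representation entropy is added as a bonus term to the cross-entropy loss, so pre-training both fits the labels and spreads the representations out. The log-determinant proxy and the weight `lam` below are illustrative assumptions, not the paper's actual metrics.

```python
import torch
import torch.nn.functional as F

def logdet_entropy(features: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Differentiable proxy for representation entropy.

    Uses the log-determinant of the (regularized) covariance matrix of a
    batch of features, which grows as the representations spread over more
    directions. This is an illustrative choice; the paper's two metrics
    are not specified in the abstract.
    """
    z = features - features.mean(dim=0, keepdim=True)  # center each dimension
    n, d = z.shape
    cov = z.T @ z / (n - 1) + eps * torch.eye(d, device=z.device)  # d x d, PSD
    return torch.logdet(cov)

def pretrain_loss(logits: torch.Tensor,
                  targets: torch.Tensor,
                  features: torch.Tensor,
                  lam: float = 0.1) -> torch.Tensor:
    """Cross-entropy minus an entropy bonus: minimizing this loss fits the
    labels while *increasing* representation entropy, matching the approach
    the abstract describes. `lam` is a hypothetical trade-off weight."""
    ce = F.cross_entropy(logits, targets)
    return ce - lam * logdet_entropy(features)
```

At fine-tuning time the regularizer would be dropped and the standard cross-entropy loss used alone; the intent, per the abstract, is that the higher-entropy pre-trained representations carry more exploitable information when labeled data is scarce.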