Why is pre-training beneficial for downstream classification tasks?

13 Sept 2024 (modified: 15 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: pre-training, explainable AI
TL;DR: This paper provides explanations for the benefits of pre-training for downstream tasks from a game-theoretic view.
Abstract: It is widely acknowledged that pre-training benefits downstream tasks by boosting accuracy and speeding up convergence, but the exact reasons for these two benefits remain unclear. To this end, we propose to quantitatively and accurately explain the effects of pre-training on the downstream task from a novel game-theoretic view, which also sheds new light on the learning behavior of deep neural networks (DNNs). Specifically, we extract and quantify the knowledge encoded by the pre-trained model, and further track the changes of such knowledge during the fine-tuning process. Interestingly, we discover that only a limited amount of the pre-trained model's knowledge is preserved for the inference of downstream tasks, and that this preserved knowledge is very difficult for a model trained from scratch to learn. Thus, with the help of this exclusively learned and useful knowledge, the fine-tuned model usually achieves better performance. Moreover, we discover that pre-training guides the fine-tuned model to learn the target knowledge of the downstream task more directly and quickly than training from scratch, which accounts for the fine-tuned model's faster convergence. The code will be released when the paper is accepted.
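The abstract does not specify how the knowledge encoded by a model is quantified game-theoretically. As a purely illustrative sketch (not the authors' method), one standard game-theoretic tool is the Shapley value, which attributes a model's output to input features via marginal contributions over coalitions. All names below (`model_logit`, `shapley_values`, the toy linear model) are hypothetical placeholders; a real study would substitute an actual pre-trained or fine-tuned DNN's class logit.

```python
# Illustrative only: Monte Carlo estimation of Shapley values over input
# features, a generic game-theoretic attribution. This is an assumption,
# not the paper's actual knowledge-quantification procedure.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a classifier's scalar output (e.g., a class logit).
W = rng.normal(size=8)
def model_logit(x_masked: np.ndarray) -> float:
    """Toy scalar output; replace with a real DNN's logit on a masked input."""
    return float(np.tanh(W @ x_masked))

def shapley_values(x: np.ndarray, baseline: np.ndarray, n_perm: int = 200) -> np.ndarray:
    """Estimate per-feature Shapley values by permutation sampling.

    The coalition value v(S) is the model output when features in S keep
    their true values and the remaining features are set to the baseline.
    """
    d = x.shape[0]
    phi = np.zeros(d)
    for _ in range(n_perm):
        order = rng.permutation(d)
        current = baseline.copy()
        prev_val = model_logit(current)
        for i in order:
            current[i] = x[i]                # add feature i to the coalition
            new_val = model_logit(current)
            phi[i] += new_val - prev_val     # marginal contribution of feature i
            prev_val = new_val
    return phi / n_perm

x = rng.normal(size=8)
baseline = np.zeros(8)
phi = shapley_values(x, baseline)
print("Shapley attributions:", np.round(phi, 3))
# Efficiency: attributions sum to v(full input) - v(baseline).
print("Sum:", np.round(phi.sum(), 3), "vs.",
      np.round(model_logit(x) - model_logit(baseline), 3))
```

Tracking how such attributions change between a pre-trained model, a fine-tuned model, and a model trained from scratch is one plausible way to compare what knowledge each model relies on, in the spirit of the abstract's claims.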
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 259