The Role of Forgetting in Fine-Tuning Reinforcement Learning Models

23 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: reinforcement learning, transfer learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Fine-tuning is a widespread technique that allows practitioners to transfer pre-trained capabilities, as recently showcased by the successful applications of foundation models. However, fine-tuning pre-trained reinforcement learning (RL) agents remains a challenge. This work conceptualizes one specific cause of poor transfer in the RL setting: *forgetting of pre-trained capabilities*. Namely, due to the distribution shift between the pre-training and fine-tuning data, the pre-trained model can significantly deteriorate before the agent reaches the parts of the state space known by the pre-trained policy. In many cases, re-learning the lost capabilities takes as much time as learning them from scratch. We identify the conditions under which this problem occurs, perform a thorough analysis, and identify potential solutions. Specifically, we propose to counteract deterioration by applying techniques that mitigate forgetting. We experimentally confirm this to be an efficient solution; for example, it allows us to significantly improve the fine-tuning process on Montezuma's Revenge as well as on the challenging NetHack domain.
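To make the idea of "mitigating forgetting during fine-tuning" concrete, below is a minimal, hedged sketch of one common knowledge-retention technique: regularizing the fine-tuned policy toward a frozen copy of the pre-trained policy on pre-training states via a KL penalty added to the RL loss. This is an illustration of the general idea only, not necessarily the specific method used in this submission; names such as `PolicyNet`, `pretrain_obs`, and `kl_coef` are hypothetical.

```python
# Sketch: RL fine-tuning loss with a knowledge-retention (KL) penalty that
# anchors the current policy to a frozen pre-trained policy on states from
# the pre-training distribution. Illustrative only.

import torch
import torch.nn as nn
import torch.nn.functional as F


class PolicyNet(nn.Module):
    """Tiny categorical policy used only for this illustration."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, n_actions)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.body(obs)  # action logits


def finetune_loss(policy, pretrained, obs, actions, advantages,
                  pretrain_obs, kl_coef=1.0):
    """Policy-gradient loss on fine-tuning data plus a KL term that keeps the
    policy close to the frozen pre-trained network on pre-training states."""
    # Standard policy-gradient term on the fine-tuning batch.
    logp = F.log_softmax(policy(obs), dim=-1)
    pg_loss = -(logp.gather(1, actions.unsqueeze(1)).squeeze(1) * advantages).mean()

    # Knowledge-retention term: KL(pretrained || current) on pre-training states.
    with torch.no_grad():
        ref_logp = F.log_softmax(pretrained(pretrain_obs), dim=-1)
    cur_logp = F.log_softmax(policy(pretrain_obs), dim=-1)
    kl = (ref_logp.exp() * (ref_logp - cur_logp)).sum(dim=-1).mean()

    return pg_loss + kl_coef * kl
```

In this sketch, setting `kl_coef` to zero recovers plain fine-tuning, while larger values trade fine-tuning plasticity for retention of pre-trained behavior on the states where the pre-trained policy is competent.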
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7158