RL-Tune: A Deep Reinforcement Learning Assisted Layer-wise Fine-Tuning Approach for Transfer Learning

Tanvir Mahmud; Natalia Frumkin; Diana Marculescu

26 May 2022 (modified: 05 May 2023) | ICML 2022 Pre-training Workshop | Readers: Everyone

Keywords: novel fine-tuning, transfer-learning, deep reinforcement learning, learning-rate scheduling

TL;DR: We propose a novel fine-tuning approach, based on deep reinforcement learning, that makes maximal use of a pre-trained model on a target dataset.

Abstract: Data scarcity is one of the major challenges in many real-world applications. To handle low-data regimes, practitioners often take an existing pre-trained network and fine-tune it on a data-deficient target task. In this setup, a network is pre-trained on a source dataset and fine-tuned on a different, potentially smaller, target dataset. We address two critical challenges with transfer learning via fine-tuning: (1) The required amount of fine-tuning greatly depends on the distribution shift from source to target dataset. (2) This distribution shift greatly varies by layer, thereby requiring layer-wise adjustments in fine-tuning to adapt to this distribution shift while preserving the pre-trained network's feature representation. To overcome these challenges, we propose RL-Tune, a layer-wise fine-tuning framework for transfer learning which leverages reinforcement learning to adjust learning rates as a function of the target data shift. In our RL framework, the state is a collection of the intermediate feature activations generated from training samples. The agent generates layer-wise learning rates as actions for fine-tuning based on the current state and obtains sample accuracy as the reward. RL-Tune outperforms other state-of-the-art approaches on standard transfer learning benchmarks by a large margin, e.g., 6% mean accuracy improvement on CUB-200-2011 with 15% data.
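
The state/action/reward loop described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The `policy` function is a hand-written placeholder for the trained agent, the per-layer summary (mean absolute activation) is an assumed simplification of the state, and the gradients are dummy values supplied by the caller rather than computed by backpropagation. All function names are hypothetical.

```python
def forward(weights, x):
    """Run a tiny MLP and return the activations of every layer."""
    activations = []
    h = x
    for i, W in enumerate(weights):
        z = [sum(w * v for w, v in zip(row, h)) for row in W]
        is_last = i == len(weights) - 1
        # ReLU on hidden layers, raw logits on the output layer
        h = z if is_last else [max(0.0, v) for v in z]
        activations.append(h)
    return activations


def state_from_batch(weights, batch):
    """State (assumed form): mean absolute activation per layer over the batch."""
    totals = [0.0] * len(weights)
    for x, _ in batch:
        for i, a in enumerate(forward(weights, x)):
            totals[i] += sum(abs(v) for v in a) / len(a)
    return [t / len(batch) for t in totals]


def policy(state, base_lr=0.1):
    """Placeholder agent: one learning rate per layer, each in (0, base_lr)."""
    return [base_lr * s / (1.0 + s) for s in state]


def layerwise_sgd_step(weights, grads, lrs):
    """Action applied: plain SGD where each layer uses its own learning rate."""
    return [
        [[w - lr * g for w, g in zip(w_row, g_row)] for w_row, g_row in zip(W, G)]
        for W, G, lr in zip(weights, grads, lrs)
    ]


def sample_accuracy(weights, batch):
    """Reward: fraction of samples whose arg-max logit matches the label."""
    correct = 0
    for x, y in batch:
        logits = forward(weights, x)[-1]
        pred = max(range(len(logits)), key=logits.__getitem__)
        correct += int(pred == y)
    return correct / len(batch)


# Toy two-layer network and a three-sample batch of (input, label) pairs.
weights = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, -1.0], [-1.0, 1.0]],
]
batch = [([2.0, 1.0], 0), ([1.0, 2.0], 1), ([3.0, 0.0], 1)]

# Dummy gradients (all ones); a real setup would obtain these via backprop.
grads = [[[1.0 for _ in row] for row in W] for W in weights]

state = state_from_batch(weights, batch)                # observe
lrs = policy(state)                                     # act: layer-wise learning rates
new_weights = layerwise_sgd_step(weights, grads, lrs)   # fine-tuning step
reward = sample_accuracy(new_weights, batch)            # reward signal
print(state, lrs, reward)
```

In a full system the reward would be fed back to update the agent, and the placeholder `policy` would be replaced by a learned network; the sketch only shows how the three quantities named in the abstract connect within one fine-tuning step.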
