GPU Memory Management for Deep Neural Networks Using Deep Q-Network

Shicheng Chen

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Withdrawn Submission | Readers: Everyone

TL;DR: We propose a reinforcement-learning-based variable swapping and recomputation algorithm to reduce GPU memory cost.

Abstract: Deep neural networks use deeper and wider structures to achieve better performance and consequently use more and more GPU memory. Limited GPU memory, however, restricts many potential network designs. In this paper, we propose a reinforcement-learning-based variable swapping and recomputation algorithm that reduces memory cost without sacrificing model accuracy. Variable swapping transfers variables between CPU and GPU memory to reduce the number of variables held in GPU memory. Recomputation trades time for space: some feature maps are discarded during forward propagation, and the corresponding forward functions are executed again to regenerate those feature maps before they are reused. Deciding automatically which variables to swap and which to recompute remains a challenging problem. To address it, we use a deep Q-network (DQN) to make the plans. By combining variable swapping and recomputation, our results outperform several well-known benchmarks.
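
The time-for-space trade behind recomputation can be illustrated with a small pure-Python sketch. This is not the paper's algorithm: it uses a fixed segment length rather than a DQN-made plan, and the toy `tanh` layers and the names `layer`, `backward_keep_all`, and `backward_recompute` are hypothetical, chosen only for illustration. It computes the same gradient twice, once storing every layer input and once keeping only segment-boundary inputs and re-running the forward functions when the gradient is needed.

```python
import math

def layer(w, x):
    # Toy forward function standing in for a network layer (assumption).
    return math.tanh(w * x)

def layer_grad(w, x):
    # Derivative of tanh(w * x) with respect to x; it needs the layer input x.
    y = math.tanh(w * x)
    return w * (1.0 - y * y)

def backward_keep_all(weights, x0):
    """Store every layer input. Returns (d out / d x0, stored activations)."""
    inputs = []
    x = x0
    for w in weights:
        inputs.append(x)
        x = layer(w, x)
    grad = 1.0
    for w, x_in in zip(reversed(weights), reversed(inputs)):
        grad *= layer_grad(w, x_in)
    return grad, len(inputs)

def backward_recompute(weights, x0, segment):
    """Keep only segment-boundary inputs; recompute the rest during backward.

    Returns (d out / d x0, peak number of activations held at once).
    Checkpoints are never freed here, so the peak is a conservative count.
    """
    checkpoints = {}
    x = x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x  # keep this input, discard the others
        x = layer(w, x)
    grad = 1.0
    peak = len(checkpoints)
    n = len(weights)
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, n)
        # Re-run the forward functions of this segment from its checkpoint.
        seg_inputs = []
        x = checkpoints[start]
        for i in range(start, end):
            seg_inputs.append(x)
            x = layer(weights[i], x)
        # seg_inputs[0] is the checkpoint itself, so count it only once.
        peak = max(peak, len(checkpoints) + len(seg_inputs) - 1)
        for i in range(end - 1, start - 1, -1):
            grad *= layer_grad(weights[i], seg_inputs[i - start])
    return grad, peak

weights = [0.5 + 0.1 * i for i in range(16)]
g_all, mem_all = backward_keep_all(weights, 0.3)
g_rec, mem_rec = backward_recompute(weights, 0.3, segment=4)
print(math.isclose(g_all, g_rec), mem_all, mem_rec)
```

For the 16-layer chain with segments of 4, both versions give the same gradient, while the recomputing version holds at most 7 activations instead of 16, at the cost of one extra forward pass per segment. Variable swapping would instead move the discarded inputs to CPU memory and copy them back before reuse; choosing between the two per variable is the planning problem the abstract assigns to the DQN.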

Keywords: GPU memory management, deep reinforcement learning, neural networks
