Integration of Efficient Deep Q-Network Techniques Into QT-Opt Reinforcement Learning Structure

Published: 22 Feb 2022 · Last Modified: 12 Sept 2024 · ICAART 2023 · Everyone · CC BY 4.0
Abstract: There has been growing interest in the development of offline reinforcement learning (RL) algorithms for real-world applications. For example, offline algorithms such as QT-Opt have demonstrated impressive performance on grasping tasks. The primary motivation is to avoid the challenges associated with online data collection. However, these algorithms require extremely large datasets as well as substantial computational resources. In this paper, we investigate the applicability of well-known improvement techniques from Deep Q-Network (DQN) methods to the offline QT-Opt algorithm, for both on-policy and mixed-policy training. For the first time, we show that prioritized experience replay (PER), noisy networks, and distributional DQN can be used within the QT-Opt framework. As a result, in a reacher environment from the PyBullet simulator, for example, we observe clear improvements in the learning process for the integrated techniques.
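To make one of the named techniques concrete, below is a minimal sketch of a noisy linear layer with factorized Gaussian noise (Fortunato et al., 2018), the building block behind "noisy networks". Inside a QT-Opt-style critic, such a layer could replace a standard fully connected layer so that exploration comes from learned parameter noise rather than epsilon-greedy action selection. This is an illustrative PyTorch sketch under assumed hyperparameters (e.g., the `sigma0` scale and the toy 64-dimensional head), not the authors' implementation.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinear(nn.Module):
    """Linear layer with factorized Gaussian parameter noise.

    During training, weights and biases are perturbed by learned noise
    scales (sigma), so exploration is driven by the network itself.
    At evaluation time, only the mean parameters (mu) are used.
    """

    def __init__(self, in_features: int, out_features: int, sigma0: float = 0.5):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        # Learnable means and noise scales for weights and biases.
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        # Noise buffers, resampled on every training forward pass.
        self.register_buffer("eps_in", torch.zeros(in_features))
        self.register_buffer("eps_out", torch.zeros(out_features))
        bound = 1.0 / math.sqrt(in_features)
        nn.init.uniform_(self.weight_mu, -bound, bound)
        nn.init.uniform_(self.bias_mu, -bound, bound)
        nn.init.constant_(self.weight_sigma, sigma0 / math.sqrt(in_features))
        nn.init.constant_(self.bias_sigma, sigma0 / math.sqrt(in_features))

    @staticmethod
    def _scaled_noise(size: int) -> torch.Tensor:
        # f(x) = sign(x) * sqrt(|x|), the factorized-noise transform.
        x = torch.randn(size)
        return x.sign() * x.abs().sqrt()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            self.eps_in.copy_(self._scaled_noise(self.in_features))
            self.eps_out.copy_(self._scaled_noise(self.out_features))
            # Outer product gives a full (out, in) weight-noise matrix
            # from two small noise vectors.
            weight = self.weight_mu + self.weight_sigma * torch.outer(self.eps_out, self.eps_in)
            bias = self.bias_mu + self.bias_sigma * self.eps_out
        else:
            weight, bias = self.weight_mu, self.bias_mu
        return F.linear(x, weight, bias)


if __name__ == "__main__":
    # Toy critic head: in QT-Opt the critic scores (state, action) pairs;
    # here we assume a 64-dim joint embedding and make the output layer noisy.
    head = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), NoisyLinear(64, 1))
    q_values = head(torch.randn(8, 64))  # shape: (batch, 1)
    print(q_values.shape)
```

In this setup, removing epsilon-greedy exploration is the usual design choice: the sigma parameters are trained jointly with the Q-loss, so the amount of injected noise adapts over the course of learning.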