Keywords: Deep Reinforcement Learning, Control
TL;DR: This paper proposes a new deep reinforcement learning algorithm that can be easily adjusted to achieve new short-term goals without retraining the network.
Abstract: Deep Reinforcement Learning (RL) algorithms can learn complex policies that optimize
agent behavior over time, and in recent years they have shown promising results
on complicated problems. However, their application to real-world physical
systems remains limited: despite these algorithmic advances, industry often
prefers traditional control strategies, which are simple, computationally
efficient, and easy to adjust. In this paper, we propose a new Q-learning
algorithm for continuous action spaces that bridges control and RL, bringing
us the best of both worlds. Our method can learn complex policies to achieve
long-term goals while remaining easy to adjust for short-term requirements
without retraining. We achieve this by modeling both a short-term and a
long-term prediction model: the short-term prediction model estimates the
system dynamics, while the long-term prediction model represents the Q-value.
Case studies demonstrate that our proposed method can achieve short-term and
long-term goals without complex reward functions.
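The abstract does not give the algorithm's details, but the idea of pairing a learned short-term dynamics model with a long-term Q-value, so that short-term goals can be changed at deployment without retraining, can be sketched as follows. Everything here is a hypothetical stand-in: `dynamics_model` and `q_value` would be learned networks in the paper, the quadratic forms and the grid search over candidate actions are illustrative assumptions only.

```python
import numpy as np

def dynamics_model(state, action):
    """Short-term prediction model: one-step estimate of the next state.
    Hypothetical linear system standing in for a learned network."""
    return 0.9 * state + 0.5 * action

def q_value(state, action):
    """Long-term prediction model (Q-value).
    Hypothetical quadratic form standing in for a learned network."""
    return -(state + action) ** 2

def select_action(state, short_term_cost, weight, candidates):
    """Pick the action maximizing the Q-value minus a weighted short-term
    cost on the predicted next state. Both the cost and the weight can be
    changed at deployment time without retraining either model."""
    scores = [q_value(state, a) - weight * short_term_cost(dynamics_model(state, a))
              for a in candidates]
    return candidates[int(np.argmax(scores))]

# Coarse grid over a continuous action interval (illustrative search only).
candidates = np.linspace(-1.0, 1.0, 201)

# New short-term goal, imposed after training: keep the next state near 0.8.
action = select_action(state=1.0,
                       short_term_cost=lambda s_next: (s_next - 0.8) ** 2,
                       weight=10.0,
                       candidates=candidates)
```

Setting `weight=0.0` recovers pure greedy Q-maximization; increasing it trades long-term value for tighter tracking of the short-term target, which is the adjustability the abstract claims over a single monolithic reward function.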
8 Replies