Abstract: Reinforcement learning (RL) has been actively researched and has shown strong performance on various tasks. However, most RL agents solve their tasks by exploring the environment with only primitive actions, which is an inefficient and difficult way to solve complex tasks. Decomposing complex tasks into sub-tasks and solving them step by step has been studied as a solution, but most of these works learn sub-tasks without considering the goals of downstream tasks, making it challenging to find the optimal combination of sub-tasks. In this paper, we propose a framework for solving complex tasks by learning to generate goal-conditioned sub-goals that lead to reaching a goal state. First, we introduce a deep latent variable model that learns the set of goal-conditioned sub-goals from datasets. We then propose an algorithm with a hierarchical structure to solve complex problems using the learned sub-goals. The proposed model consists of two levels of policies, where the higher-level policy selects the optimal sub-task to solve the problem, and the lower-level policy outputs primitive actions to reach each sub-goal. To verify the effectiveness of our method, we compare our framework with other baselines in a long-horizon maze environment.
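To make the two-level structure described in the abstract concrete, below is a minimal sketch (not the authors' code) of a hierarchical control loop: a high-level policy proposes a goal-conditioned sub-goal, and a low-level policy outputs primitive actions to reach it. The environment, policy classes, and all names and constants here are illustrative assumptions; the learned latent-variable sub-goal generator is replaced by a simple heuristic.

```python
import numpy as np


class ToyPointEnv:
    """Toy 2-D point environment standing in for the maze (illustrative only)."""

    def reset(self):
        self.pos = np.zeros(2)
        return self.pos.copy()

    def step(self, action):
        self.pos += 0.1 * np.clip(action, -1.0, 1.0)
        return self.pos.copy(), 0.0, False  # observation, reward, done


class HighLevelPolicy:
    """Selects a goal-conditioned sub-goal given the current state and final goal."""

    def __init__(self, rng=None):
        self.rng = rng or np.random.default_rng(0)

    def propose_subgoal(self, state, goal):
        # Placeholder for a learned sub-goal generator: move a fraction of the
        # way from the current state toward the final goal, plus small noise.
        return state + 0.25 * (goal - state) + 0.01 * self.rng.standard_normal(state.shape)


class LowLevelPolicy:
    """Outputs a primitive action that moves the agent toward the sub-goal."""

    def act(self, state, subgoal):
        return np.clip(subgoal - state, -1.0, 1.0)


def hierarchical_rollout(env, high, low, goal, horizon=100, subgoal_every=10):
    """Two-level control loop: resample the sub-goal every few low-level steps."""
    state = env.reset()
    subgoal = high.propose_subgoal(state, goal)
    for t in range(horizon):
        if t % subgoal_every == 0:
            subgoal = high.propose_subgoal(state, goal)
        action = low.act(state, subgoal)
        state, _, done = env.step(action)
        if done:
            break
    return state


final_state = hierarchical_rollout(
    ToyPointEnv(), HighLevelPolicy(), LowLevelPolicy(), goal=np.array([1.0, 1.0])
)
print(final_state)
```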