Abstract: Applications of reinforcement learning to continuous control tasks often rely on a steady, informative reward signal. In videogames, however, tasks may be far easier to specify through a binary reward that indicates success or failure. In the absence of a steady, guiding reward, the agent may struggle to explore efficiently, especially if effective exploration requires strong coordination between actions. In this paper, we show empirically that this issue may be mitigated by exploring over an abstract action set, using hierarchically composed parameterized skills. We experiment in two tasks with sparse rewards in a continuous control environment based on the arcade game Asteroids. Compared to a flat learner that explores symmetrically over low-level actions, our agent explores a greater variety of useful actions, and its long-term performance on both tasks is superior.
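The contrast the abstract draws between flat exploration over low-level actions and exploration over hierarchically composed parameterized skills can be illustrated with a minimal sketch. The snippet below is not the authors' implementation: the action format (turn rate, thrust), the skill definition, and all names are hypothetical, chosen only to show how sampling a few abstract skill parameters unrolls into temporally coordinated low-level actions, whereas a flat learner samples each low-level action independently.

```python
# Illustrative sketch only -- assumed action format and a made-up skill,
# not the paper's agent or environment.
import random
from typing import Iterator, Tuple

Action = Tuple[float, float]  # (turn_rate in [-1, 1], thrust in [0, 1]); assumed format


def flat_exploration_step() -> Action:
    """Symmetric exploration over low-level actions: each dimension sampled independently."""
    return (random.uniform(-1.0, 1.0), random.uniform(0.0, 1.0))


def rotate_then_thrust_skill(target_turn: float, thrust_steps: int) -> Iterator[Action]:
    """A hypothetical parameterized skill: turn at a chosen rate for a few steps,
    then thrust forward. Exploration happens over (target_turn, thrust_steps)
    rather than over individual low-level actions."""
    for _ in range(3):             # commit to the chosen turn direction
        yield (target_turn, 0.0)
    for _ in range(thrust_steps):  # then thrust along the new heading
        yield (0.0, 1.0)


if __name__ == "__main__":
    # Flat learner: a rollout of uncorrelated low-level actions.
    flat_rollout = [flat_exploration_step() for _ in range(8)]

    # Skill-based learner: sample two abstract parameters, unroll into coordinated actions.
    params = (random.uniform(-1.0, 1.0), random.randint(2, 5))
    skill_rollout = list(rotate_then_thrust_skill(*params))

    print("flat  :", [tuple(round(x, 2) for x in a) for a in flat_rollout])
    print("skill :", skill_rollout)
```

Under this toy framing, the skill-based rollout concentrates exploration on coordinated sequences (turn, then thrust), which is the kind of structured behavior the abstract argues is hard to discover by perturbing low-level actions independently.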