Continual Reinforcement Learning (CRL) is a powerful tool that enables agents to learn a sequence of tasks, accumulating knowledge learned in the past and using it for problem-solving or future task learning. However, existing CRL methods all assume that the agent's capabilities remain static within dynamic environments, which does not reflect real-world scenarios where capabilities evolve. This paper introduces Self-Evolution Continual Reinforcement Learning (SE-CRL), a new and realistic problem setting in which the agent's action space continually changes. This poses a significant challenge for RL agents: how can policy generalization across different action spaces be achieved? Inspired by the cortical functions that lead to consistent human behavior, we propose an Action Representation Continual Reinforcement Learning framework (ARC-RL) to address this challenge. Our framework builds a representation space for actions through self-supervised learning on transitions, decoupling the agent's policy from any specific action space. For a new action space, the decoder of the action representation is expanded or masked for adaptation and fine-tuned with regularization to improve the stability of the policy. Furthermore, we release a benchmark based on MiniGrid to validate the effectiveness of methods for SE-CRL. Experimental results demonstrate that our framework significantly outperforms popular CRL methods by generalizing the policy across different action spaces.
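To make the core idea concrete, the sketch below illustrates how a policy can act in a fixed latent action space while a separate decoder maps latent actions to whatever concrete action space is currently available, and how that decoder might be expanded when new actions appear. This is a minimal illustrative sketch only: the class names, network sizes, and the specific expansion rule are assumptions for exposition, not the authors' implementation of ARC-RL.

```python
import torch
import torch.nn as nn

# Minimal sketch (assumption: names and sizes are illustrative, not the
# authors' implementation). The policy outputs a latent action z in a
# fixed representation space; a decoder maps z to the *current* concrete
# action space, so the policy itself stays unchanged when actions change.

class ActionDecoder(nn.Module):
    def __init__(self, latent_dim: int, num_actions: int):
        super().__init__()
        self.head = nn.Linear(latent_dim, num_actions)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Logits over the concrete actions currently available.
        return self.head(z)

    def expand(self, new_num_actions: int) -> None:
        # When the action space grows, add output units while copying the
        # old weights, so previously learned mappings are preserved.
        old = self.head
        self.head = nn.Linear(old.in_features, new_num_actions)
        with torch.no_grad():
            self.head.weight[: old.out_features] = old.weight
            self.head.bias[: old.out_features] = old.bias


# Usage sketch: decode the same latent action before and after expansion.
decoder = ActionDecoder(latent_dim=8, num_actions=4)
z = torch.randn(1, 8)
print(decoder(z).shape)          # torch.Size([1, 4])
decoder.expand(new_num_actions=6)
print(decoder(z).shape)          # torch.Size([1, 6])
```

In this sketch, keeping the old output weights when expanding plays the role of the stability-oriented adaptation described above; in the actual framework, regularized fine-tuning of the decoder serves that purpose.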