Goal Randomization for Playing Text-based Games without a Reward Function

29 Sept 2021 (modified: 13 Feb 2023), ICLR 2022 Conference Withdrawn Submission, Readers: Everyone
Keywords: Text-based games, Deep reinforcement learning, Generalization
Abstract: Playing text-based games requires language understanding and sequential decision making. The objective of a reinforcement learning (RL) agent is to behave so as to maximise the sum of a suitable scalar reward function. In contrast to current RL methods, humans are able to learn new skills with little or no reward by using various forms of intrinsic motivation. We propose a goal randomization method that uses random basic goals to train a policy in the absence of environment rewards. Specifically, through simple but effective goal generation, our method learns to continuously propose goals that are challenging, yet temporal and achievable, and that allow the agent to learn general skills for acting in a new environment, independent of the task to be solved. On a variety of text-based games, we show that this simple method yields competitive performance. We also show that our method can learn policies that generalize across different text-based games. Furthermore, we demonstrate an interesting result: on some text-based games, our method outperforms GATA, a state-of-the-art agent that uses environment rewards.
One-sentence Summary: We propose goal randomization, a new method for playing text-based games in the absence of environment rewards.