PixL2R: Guiding Reinforcement Learning using Natural Language by Mapping Pixels to Rewards

Prasoon Goyal; Scott Niekum; Ray Mooney

Published: 17 Jul 2020, Last Modified: 13 Apr 2025 | LaReL 2020 | Readers: Everyone

Abstract: Reinforcement learning (RL), particularly in sparse reward settings, often requires prohibitively large numbers of interactions with the environment, thereby limiting its applicability to complex problems. To address this, several prior approaches have used natural language to guide the agent's exploration. However, these approaches typically operate on structured representations of the environment, and/or assume some structure in the natural language commands. In this work, we propose a model that directly maps pixels to rewards, given a free-form natural language description of the task, which can then be used for policy training. Our experiments on the Meta-World robot manipulation domain show that language-based rewards significantly improve learning. Further, we analyze the resulting framework using multiple ablation experiments to better understand the nature of these improvements.

TL;DR: We propose an approach for mapping pixels to rewards, conditioned on a free-form natural language description of the task, which can then be used to improve the sample efficiency of reinforcement learning.

Keywords: reinforcement learning, language, reward shaping

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/pixl2r-guiding-reinforcement-learning-using/code)
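
To make the "reward shaping" keyword concrete, here is a minimal sketch of how a language-conditioned score over pixel observations could be added to a sparse environment reward. It uses the standard potential-based shaping form, F = gamma * phi(s') - phi(s); whether the paper uses exactly this form is not stated in the abstract, so treat that as an assumption. The names `toy_relatedness`, `shaped_reward`, `weight`, and the `Frame` alias are hypothetical and exist only for illustration; the toy scorer stands in for a learned model and is not the authors' network.

```python
from typing import Callable, Sequence

# Stand-in for a flattened pixel observation (assumption: values in [0, 1]).
Frame = Sequence[float]


def toy_relatedness(frame: Frame, description: str) -> float:
    """Hypothetical placeholder for a learned pixels+language scoring model.

    Returns mean brightness if the description mentions "bright",
    and the negated mean brightness otherwise.
    """
    brightness = sum(frame) / len(frame)
    return brightness if "bright" in description else -brightness


def shaped_reward(
    env_reward: float,
    frame: Frame,
    next_frame: Frame,
    description: str,
    potential: Callable[[Frame, str], float],
    gamma: float = 0.99,
    weight: float = 1.0,
) -> float:
    """Add a potential-based shaping term to the environment reward.

    The shaping term is gamma * phi(next_frame) - phi(frame), where phi is
    the language-conditioned score supplied by `potential`.
    """
    phi = potential(frame, description)
    phi_next = potential(next_frame, description)
    return env_reward + weight * (gamma * phi_next - phi)
```

In a training loop, the shaped value would replace the raw environment reward passed to the policy-learning algorithm, so the agent receives a dense signal even when the environment reward is zero for most steps.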
