Keywords: language advice, human in the loop, human interaction, reinforcement learning
Abstract: Natural language advice has the potential to accelerate reinforcement learning, but efficiently utilizing diverse and highly detailed forms of language remains unsolved. Existing methods focus on mapping natural language to individual elements of MDPs, such as reward functions or policies, but limit the scope of language they consider in order to make such mappings possible. We propose to leverage language advice by translating sentences into a grounded formal language capable of expressing information about every element of an MDP and its solution, including policies, plans, reward functions, and transition functions. We also introduce a new model-based reinforcement learning algorithm, RLang-Dyna-Q, capable of leveraging all such advice, and demonstrate in two sets of experiments that grounding language to every element of an MDP leads to significant performance gains. In additional symbol-grounding demonstrations, we show how vision-language models can annotate important structure in the environment in the form of RLang vocabulary files, eliminating the need for human labels.
Primary Area: reinforcement learning
Supplementary Material: zip
Submission Number: 1815