Abstract: Transferring knowledge across domains can ease the complexity of reward design in deep reinforcement learning (DRL). To this end, we define \textit{reward translation} to describe the problem of transferring rewards across domains. However, current methods struggle with incompatible MDPs that cannot be paired or aligned in time.
This paper presents an adaptable reward translation framework, \textit{neural reward translation}, built on \textit{semi-alignable MDPs}, which enables efficient reward translation under relaxed alignment constraints while handling the intricacies of incompatible MDPs. Because semi-alignable MDPs are difficult to map directly for reward transfer, we introduce an indirect mapping through reward machines, constructed with limited human input or LLM-based automated learning.
Graph-matching techniques establish links between reward machines from distinct environments, thus enabling cross-domain reward translation within semi-alignable MDP settings. This broadens the applicability of DRL across multiple domains.
Experiments substantiate the effectiveness of our approach on tasks in environments with semi-alignable MDPs.
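To make the graph-matching step concrete, the following is a minimal illustrative sketch, not the paper's implementation: two toy reward machines (the hypothetical `rm_source`, `rm_target`, and `label_map` below are invented for illustration) are treated as labeled graphs, and their states are matched by combining a Jaccard similarity over outgoing event labels with the Hungarian algorithm, so that reward knowledge attached to a source state can be carried over to its matched target state.

```python
# Sketch of matching states between two reward machines (RMs) so that
# rewards attached to one RM can be transferred to the other.
import numpy as np
from scipy.optimize import linear_sum_assignment

# Each RM: {state: {event_label: next_state}} (hypothetical toy machines).
rm_source = {"u0": {"pick": "u1"}, "u1": {"place": "u2"}, "u2": {}}
rm_target = {"v0": {"grasp": "v1", "move": "v0"}, "v1": {"drop": "v2"}, "v2": {}}

# Assumed hand-specified (or LLM-proposed) correspondence between event labels.
label_map = {"pick": "grasp", "place": "drop"}

def out_labels(rm, state, mapping=None):
    """Outgoing event labels of a state, optionally renamed via the label map."""
    labels = set(rm[state])
    return {mapping.get(l, l) for l in labels} if mapping else labels

src_states, tgt_states = list(rm_source), list(rm_target)
sim = np.zeros((len(src_states), len(tgt_states)))
for i, s in enumerate(src_states):
    for j, t in enumerate(tgt_states):
        a, b = out_labels(rm_source, s, label_map), out_labels(rm_target, t)
        union = a | b
        # Jaccard overlap of outgoing labels; two terminal states match fully.
        sim[i, j] = len(a & b) / len(union) if union else 1.0

# Hungarian algorithm on the negated similarity yields a max-similarity matching.
rows, cols = linear_sum_assignment(-sim)
state_match = {src_states[i]: tgt_states[j] for i, j in zip(rows, cols)}
print(state_match)  # {'u0': 'v0', 'u1': 'v1', 'u2': 'v2'}
```

Real reward machines would also use transition structure and richer node features when scoring candidate matches; the label-overlap score above is only the simplest stand-in for such a graph-matching objective.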
Lay Summary: Designing effective reward signals for artificial intelligence (AI) agents to learn complex tasks—such as robots grasping objects or navigating environments—is challenging and time-intensive. Ideally, we would reuse the reward structures already developed for one task to help train agents in different tasks. However, transferring reward signals between tasks often fails because tasks differ too greatly, making direct comparisons difficult or impossible.
To solve this, we introduce a new method called neural reward translation. Our approach does not require tasks to match perfectly in terms of time or actions—tasks only need partial similarity (what we call "semi-alignable"). Instead of directly mapping rewards between tasks, we first represent rewards using simplified structures called "reward machines," which can be created with minimal human guidance or automatically through language-based AI tools.
We then use graph-matching techniques—essentially, methods that find similarities between structured information—to connect these reward machines from different tasks. Tests show our method successfully transfers reward knowledge between tasks that differ substantially. This approach makes it easier and faster to train AI across many diverse, real-world scenarios.
Primary Area: Reinforcement Learning
Keywords: Reinforcement Learning, Reward Shaping
Submission Number: 3803