Understanding Behavioral Metric Learning: A Large-Scale Study on Distracting Reinforcement Learning Environments

Ziyan Luo; Tianwei Ni; Pierre-Luc Bacon; Doina Precup; Xujie Si

Understanding Behavioral Metric Learning: A Large-Scale Study on Distracting Reinforcement Learning Environments

Ziyan Luo, Tianwei Ni, Pierre-Luc Bacon, Doina Precup, Xujie Si

Published: 09 May 2025, Last Modified: 20 Aug 2025RLC 2025EveryoneRevisionsBibTeXCC BY 4.0

Keywords: Behavioral Metrics, Deep Reinforcement Learning, Evaluation, Benchmarking

TL;DR: The benefits of learning behavioral metrics in deep RL are reduced compared to baselines that incorporate key design choices, as reflected in our large-scale study across diverse distracting tasks.

Abstract: A key approach to state abstraction is approximating behavioral metrics (notably, bisimulation metrics) in the observation space and embedding these learned distances in the representation space. While promising for robustness to task-irrelevant noise, as shown in prior work, accurately estimating these metrics remains challenging, requiring various design choices that create gaps between theory and practice. Prior evaluations focus mainly on final returns, leaving the quality of learned metrics and the source of performance gains unclear. To systematically assess how metric learning works in deep reinforcement learning (RL), we evaluate five recent approaches, unified conceptually as isometric embeddings with varying design choices. We benchmark them with baselines across 20 state-based and 14 pixel-based tasks, spanning 370 task configurations with diverse noise settings. Beyond final returns, we introduce the evaluation of a denoising factor to quantify the encoder's ability to filter distractions. To further isolate the effect of metric learning, we propose and evaluate an isolated metric estimation setting, in which the encoder is influenced solely by the metric loss. Finally, we release an open-source, modular codebase to improve reproducibility and support future research on metric learning in deep RL.

Submission Number: 2

Loading