FiRL: Finslerian Reinforcement Learning for Risk-Aware Anisotropic Locomotion

ICLR 2026 Conference Submission 12951 Authors

18 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | CC BY 4.0
Keywords: Reinforcement Learning, Finsler Geometry, Risk-sensitive RL, Conditional Value-at-Risk (CVaR), Anisotropic Locomotion, Quasimetric Learning, Robot Locomotion
Abstract: Legged locomotion is inherently anisotropic and risk-sensitive: the energy cost and risk of failure vary significantly with the direction and speed of motion. Standard reinforcement learning (RL) methods neglect this asymmetry, typically using isotropic cost/reward functions and optimizing only for expected returns. This leaves agents vulnerable to rare but catastrophic outcomes. We propose Finslerian Reinforcement Learning (FiRL), a novel RL framework that integrates a Finsler metric into the cost function for directional energy-awareness and optimizes a Conditional Value-at-Risk ($CVaR_\alpha$) objective for tail-risk robustness. FiRL formulates the locomotion cost as $F(x,v)$, a Finsler metric that varies with state $x$ and motion $v$, capturing uphill vs.\ downhill effort, lateral friction, and other direction-dependent costs. We derive a risk-sensitive Bellman equation based on $CVaR_\alpha$ and prove that the corresponding CVaR–Finsler Bellman operator is a $\gamma$-contraction, yielding a unique fixed-point value function that induces a quasi-metric structure (satisfying a triangle inequality despite asymmetry). We develop a FiRL actor–critic algorithm to learn policies under this anisotropic, risk-averse objective. In simulated MuJoCo locomotion benchmarks, FiRL achieves safer and more energy-efficient behaviors than state-of-the-art baselines (e.g., risk-neutral PPO). For example, on a Hopper task with a $12^\circ$ slope, FiRL reduces tail-risk ($CVaR_{0.1}$) impact forces by over $35\%$ and total energy cost by $15\%$, while attaining a higher success rate.
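The two ingredients named in the abstract, a direction-dependent cost $F(x,v)$ and a $CVaR_\alpha$ tail statistic, can be illustrated with a minimal sketch. The abstract does not specify the form of $F$, so the Randers-type metric $F(x,v) = \sqrt{v^\top A v} + b^\top v$ used here is an assumption chosen only because it is a standard asymmetric Finsler metric; the names `randers_cost` and `empirical_cvar`, the matrix `A`, and the drift vector `b` are hypothetical and are not taken from the paper. The CVaR estimator likewise assumes a cost convention (larger is worse) and a simple sample average over the worst $\alpha$-fraction.

```python
import numpy as np


def randers_cost(v, A, b):
    """Randers-type Finsler cost F(x, v) = sqrt(v^T A v) + b^T v.

    Assumption: A and b are already evaluated at the state x. A must be
    symmetric positive definite and b^T A^{-1} b < 1 so that F stays
    positive for every nonzero v. This is an illustrative choice, not
    the metric used in the paper.
    """
    v = np.asarray(v, dtype=float)
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(v @ A @ v) + b @ v)


def empirical_cvar(costs, alpha):
    """Sample CVaR_alpha of costs: mean of the worst alpha-fraction.

    Assumption: larger cost is worse, so the tail is the largest values.
    """
    costs = np.sort(np.asarray(costs, dtype=float))[::-1]  # worst first
    k = max(1, int(np.ceil(alpha * len(costs))))
    return float(costs[:k].mean())


# Hypothetical slope along +x: moving uphill (+x) costs more than downhill (-x).
A = np.eye(2)
b = np.array([0.5, 0.0])

uphill = randers_cost([1.0, 0.0], A, b)
downhill = randers_cost([-1.0, 0.0], A, b)
lateral = randers_cost([0.0, 1.0], A, b)

# Tail risk of a small batch of per-episode costs (made-up numbers).
tail = empirical_cvar([1.0, 2.0, 3.0, 10.0], alpha=0.5)
```

With these hypothetical values the uphill step costs more than the lateral step, which in turn costs more than the downhill step, while the cost still satisfies the triangle inequality $F(u+w) \le F(u) + F(w)$. This is the asymmetric, quasi-metric behavior the abstract describes.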
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 12951