Keywords: Reinforcement Learning, Failure Detection
TL;DR: A method to identify and quantify failure modes in robotic policies
Abstract: Robot manipulation policies often fail for unknown reasons, posing significant hurdles for safe real-world deployment. Existing diagnostic approaches rely on hand-crafted rules and exhaustive scenario enumeration, which are labor-intensive and prone to missing critical failure modes (FMs). Reinforcement learning (RL) offers a systematic way to explore high-dimensional environmental variations by optimizing policies toward failure-inducing conditions. We present Robot Manipulation Diagnosis (RoboMD), a policy-agnostic framework that leverages deep RL over a learned vision–language embedding to efficiently search the failure space. Rather than manually specifying each scenario, RoboMD produces a probabilistic failure-likelihood map, highlighting high-risk conditions without exhaustive enumeration. Evaluations on diverse simulation benchmarks and a robot arm show that RoboMD discovers up to 23% more FMs than state-of-the-art vision–language models and uncovers subtle vulnerabilities missed by heuristic methods. By mapping failures in embedding space, RoboMD provides actionable insights to guide targeted policy improvements and enhance robustness in unseen environments.
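The search described in the abstract can be illustrated with a minimal toy sketch. This is not the paper's implementation: the 2-D grid standing in for the learned vision–language embedding, the epsilon-greedy bandit search, and the `rollout_fails` stand-in policy (which fails more often near one corner of the space) are all hypothetical simplifications, used only to show how adaptive exploration can build a failure-likelihood map without enumerating every scenario.

```python
import random

random.seed(0)
GRID = 5  # discretize each embedding axis into 5 cells (toy assumption)

def rollout_fails(cell):
    """Hypothetical policy rollout: returns True if the rollout fails.

    Stand-in for executing the manipulation policy under the
    environmental variation encoded by `cell`; failure probability
    rises toward the (1, 1) corner of the normalized space.
    """
    x, y = cell[0] / (GRID - 1), cell[1] / (GRID - 1)
    return random.random() < 0.1 + 0.8 * x * y

fails = {(i, j): 0 for i in range(GRID) for j in range(GRID)}
trials = {(i, j): 0 for i in range(GRID) for j in range(GRID)}

for step in range(2000):
    if step < 25 or random.random() < 0.2:
        # explore: sample a random environmental variation
        cell = (random.randrange(GRID), random.randrange(GRID))
    else:
        # exploit: revisit the condition with the highest estimated
        # failure likelihood so far (the failure-inducing direction)
        cell = max(fails, key=lambda c: fails[c] / max(trials[c], 1))
    trials[cell] += 1
    fails[cell] += rollout_fails(cell)

# Probabilistic failure-likelihood map: estimated P(failure) per cell.
fmap = {c: fails[c] / max(trials[c], 1) for c in fails}
worst = max(fmap, key=fmap.get)
```

After the loop, `fmap` concentrates trials on high-risk regions, so the riskiest conditions are estimated far more precisely than a uniform sweep would allow; `worst` identifies the most failure-prone variation found.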
Submission Number: 12