Task-Relevant Depth Quality Metrics for Suction Grasping

Published: 27 May 2026, Last Modified: 04 Jun 2026, Venue: FMEA @ CVPR 2026 Poster, License: CC BY 4.0
Keywords: depth foundation models, embodied manipulation, task-relevant metrics, suction grasping, contact mechanics, depth evaluation, benchmarking
TL;DR: We propose physics-grounded depth quality metrics for embodied suction grasping and show that depth foundation models with worse RMSE can produce better geometry for grasp success.
Abstract: Depth foundation models are increasingly used as a perception backbone for embodied manipulation, and are typically evaluated with metrics that measure global accuracy (RMSE, MAE, AbsRel) but fail to capture the local geometric properties that determine suction grasp success. These properties include surface planarity within the contact patch, surface normal accuracy at grasp points, and contact patch completeness near object boundaries. We propose four task-relevant depth quality metrics grounded in suction contact mechanics and evaluate two depth foundation models (Depth Anything V2, Marigold) against raw structured-light sensor depth on 1,200 images from the GraspNet-1Billion dataset, using synthetic ground-truth depth rendered from object meshes. Our results reveal a consistent rank reversal: the raw depth sensor achieves two to three times better RMSE than the foundation models, yet scores worse than at least one foundation model on every task-relevant metric. The foundation models produce geometrically coherent surfaces (smooth, complete, with consistent normals) despite worse metric accuracy, and suction grasping rewards coherence over accuracy. This suggests that standard metrics can mislead practitioners selecting depth backbones for embodied manipulation, and that hybrid pipelines using sensor depth for positioning and foundation-model depth for grasp evaluation and final approach may be beneficial. An earlier version was accepted to the ICRA 2026 ASAB and Rigorous Perception workshops.
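The rank reversal described in the abstract can be illustrated with a toy example. The sketch below is not the paper's metric definition, which the abstract does not give. It assumes a hypothetical planarity score (the RMS residual of a least-squares plane fit over a circular contact patch, computed in pixel coordinates along the depth axis) and two synthetic depth maps: a "sensor-like" map that is unbiased but noisy, and a "model-like" map that is smooth but offset by 1 cm. All function names, the patch radius, and the noise levels are illustrative assumptions.

```python
import numpy as np


def rmse(pred, gt):
    """Global root-mean-square depth error over the whole image."""
    return float(np.sqrt(np.mean((pred - gt) ** 2)))


def patch_planarity_residual(depth, center, radius):
    """RMS residual of a least-squares plane fit inside a circular patch.

    Hypothetical stand-in for a contact-patch planarity metric: lower
    values mean the patch looks flatter to a suction cup.
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = center
    mask = (ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2
    # Fit z = a*x + b*y + c to the depth values inside the patch.
    A = np.column_stack([xs[mask], ys[mask], np.ones(mask.sum())])
    z = depth[mask]
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return float(np.sqrt(np.mean((A @ coeffs - z) ** 2)))


rng = np.random.default_rng(0)
ys, xs = np.mgrid[0:64, 0:64]
gt = 0.5 + 0.001 * xs                                 # tilted plane, metres
sensor_like = gt + rng.normal(0.0, 0.002, gt.shape)   # unbiased, 2 mm noise (assumed)
model_like = gt + 0.01                                # smooth, 1 cm bias (assumed)

for name, depth in [("sensor-like", sensor_like), ("model-like", model_like)]:
    print(name,
          "RMSE:", rmse(depth, gt),
          "planarity residual:", patch_planarity_residual(depth, (32, 32), 10))
```

In this toy setup the sensor-like map has the lower RMSE while the model-like map has the lower planarity residual, which mirrors the kind of disagreement between global and task-relevant metrics that the abstract reports. A fuller treatment would back-project pixels to 3D using the camera intrinsics and measure residuals perpendicular to the fitted plane.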
Submission Number: 54