GRIM: Task-Oriented Grasping with Conditioning on Generative Examples

Published: 01 Jun 2025, Last Modified: 23 Jun 2025, OOD Workshop @ RSS 2025, CC BY 4.0
Keywords: Task-Oriented Grasping, Training-Free, Retrieval, Semantic Alignment, Generative Examples
TL;DR: GRIM is a training-free framework for task-oriented grasping that aligns objects using geometric and DINO features, then transfers and refines grasp poses for task compatibility, requiring only a few conditioning examples.
Abstract: Task-Oriented Grasping (TOG) presents a significant challenge, requiring a nuanced understanding of task semantics, object affordances, and the functional constraints that dictate how an object should be grasped for a specific task. To address these challenges, we introduce GRIM (Grasp Re-alignment via Iterative Matching), a novel training-free framework for task-oriented grasping. GRIM first coarsely aligns the scene object with instances stored in a grasp memory, scoring their similarity with a combination of geometric cues and principal component analysis (PCA)-reduced DINO features. The full grasp pose associated with the retrieved memory instance is then transferred to the aligned scene object and further refined against a set of task-agnostic, geometrically stable grasps generated for the scene object, prioritizing task compatibility. In contrast to existing learning-based methods, GRIM demonstrates strong generalization, achieving robust performance with only a small number of conditioning examples.
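
To make the retrieval step above concrete, the following is a minimal, illustrative sketch of similarity scoring that combines PCA-reduced DINO features with a simple geometric cue. It assumes per-point DINO features and bounding-box extents have already been extracted for the scene object and for every memory instance; the function names, the choice of geometric cue, and the weighting are assumptions made for illustration and are not taken from the GRIM implementation.

import numpy as np
from sklearn.decomposition import PCA


def reduced_descriptors(scene_feats, memory_feats, n_components=32):
    # scene_feats, memory_feats: (N, D) arrays of per-point DINO features.
    # Fit one PCA on both sets so the reduced descriptors live in the same space.
    pca = PCA(n_components=n_components).fit(np.vstack([scene_feats, memory_feats]))
    return (pca.transform(scene_feats).mean(axis=0),
            pca.transform(memory_feats).mean(axis=0))


def similarity_score(scene_feats, memory_feats, scene_extent, memory_extent, w_geom=0.5):
    # Semantic term: cosine similarity of the mean PCA-reduced DINO descriptors.
    a, b = reduced_descriptors(scene_feats, memory_feats)
    semantic = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    # Geometric term (hypothetical cue): agreement of object bounding-box extents.
    geometric = float(np.exp(-np.linalg.norm(np.asarray(scene_extent)
                                             - np.asarray(memory_extent))))
    return (1.0 - w_geom) * semantic + w_geom * geometric


def retrieve(scene_feats, scene_extent, memory):
    # Pick the memory instance with the highest combined score; its stored
    # task-oriented grasp would then be transferred to the aligned scene object
    # and refined against task-agnostic stable grasps, as described above.
    scores = [similarity_score(scene_feats, m["feats"], scene_extent, m["extent"])
              for m in memory]
    return int(np.argmax(scores))
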
Supplementary Material: zip
Submission Number: 24