Keywords: Dexterous Grasping, Flow Matching, Large Language Models
TL;DR: We introduce FLASH, a method for language-conditioned dexterous grasping that jointly models task intent and physical contact quality for robot hands.
Abstract: We introduce FLASH, a method for language-conditioned dexterous grasping that jointly models task intent and physical contact quality for robot hands. Unlike prior approaches, our text-conditioned grasp synthesis pipeline explicitly incorporates geometric information during generation. FLASH learns a single flow-matching model conditioned on hand and object point clouds and on natural language instructions. Our model operates on live-updated, vectorized hand meshes and is trained on our improved grasp dataset, FLASH-drive, which includes refined grasps, watertight meshes, and augmented text annotations. This enables FLASH to outperform prior work in producing physically plausible grasps that align with goals specified via text. We use a pre-trained large language model as the backbone of our architecture, enabling generalization to novel prompts and objects.
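The abstract describes learning a flow-matching model conditioned on point clouds and text. As a minimal sketch of what flow-matching training targets look like (linear-interpolant / rectified-flow style), the snippet below is purely illustrative: the 7-D "grasp pose" vectors, batch size, and zero-output stand-in network are assumptions, not the FLASH architecture.

```python
# Minimal sketch of the flow-matching training target (linear path).
# Names and dimensions are illustrative, not taken from FLASH.
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_targets(x0, x1, t):
    """Interpolant x_t and target velocity for a straight-line path."""
    t = t[:, None]                      # broadcast time over feature dim
    x_t = (1.0 - t) * x0 + t * x1       # point on the path from noise to data
    v_target = x1 - x0                  # constant velocity along that path
    return x_t, v_target

# Toy batch: 4 samples of a 7-D grasp parameterization (illustrative).
x1 = rng.normal(size=(4, 7))            # "data" samples (e.g. grasp poses)
x0 = rng.normal(size=(4, 7))            # noise samples
t = rng.uniform(size=4)                 # time in [0, 1]

x_t, v_target = flow_matching_targets(x0, x1, t)

# Training would regress a conditional network v_theta(x_t, t, cond) onto
# v_target, where cond holds point-cloud and language features; here a
# zero output stands in for the network to show the loss form only.
v_pred = np.zeros_like(v_target)
loss = np.mean((v_pred - v_target) ** 2)
```

In this formulation the conditioning (point-cloud and text embeddings) enters only through the velocity network, while the interpolant and target stay unchanged.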
Submission Number: 10