Keywords: Dynamic hand grasping, Benchmark
Abstract: Most existing hand-grasping benchmarks focus on static objects and therefore fail to capture the challenges of dynamic, real-world scenarios in which targets move and precise timing becomes critical. We propose the Dynamic Grasp Suite (DGS), a unified platform for dynamic grasp evaluation, and Dyana-12M, a large-scale benchmark with 12M frames of human-hand dynamic grasp trajectories. Dyana-12M represents target motion with three interpretable trajectory primitives: straight-line, circular-arc, and simple-harmonic, which compose into arbitrarily complex trajectories. DGS standardizes interfaces and protocols, supporting the evaluation of three major model families: vision–language–action (VLA) agents, diffusion policies, and vision–language models (VLMs).
Together, DGS and Dyana-12M establish a new paradigm for dynamic grasping, shifting evaluation from static scenes to motion-aware, temporally aligned assessment at scale.
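To illustrate how the three trajectory types named in the abstract could compose into a more complex target motion, the following is a minimal sketch, not the benchmark's actual implementation. It assumes that composition means concatenating primitives in time, each starting where the previous one ended; the abstract does not specify the composition rule. All function names, parameters, and the choice of plane or axis are hypothetical.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]
# A primitive maps local time (seconds since the segment began) to a
# displacement from the segment's starting point. Hypothetical convention.
Primitive = Callable[[float], Point]


def straight_line(velocity: Point) -> Primitive:
    """Constant-velocity motion."""
    def displacement(t: float) -> Point:
        return (velocity[0] * t, velocity[1] * t, velocity[2] * t)
    return displacement


def circular_arc(radius: float, omega: float) -> Primitive:
    """Arc in the xy-plane at angular rate omega, zero displacement at t=0."""
    def displacement(t: float) -> Point:
        angle = omega * t
        return (radius * (math.cos(angle) - 1.0), radius * math.sin(angle), 0.0)
    return displacement


def simple_harmonic(amplitude: float, omega: float, axis: int = 0) -> Primitive:
    """Sinusoidal oscillation along one coordinate axis."""
    def displacement(t: float) -> Point:
        d = [0.0, 0.0, 0.0]
        d[axis] = amplitude * math.sin(omega * t)
        return (d[0], d[1], d[2])
    return displacement


def compose(start: Point,
            segments: List[Tuple[Primitive, float]]) -> Callable[[float], Point]:
    """Chain (primitive, duration) segments end to end (assumed rule)."""
    def position(t: float) -> Point:
        origin = start
        for primitive, duration in segments:
            if t <= duration:
                d = primitive(max(t, 0.0))
                return (origin[0] + d[0], origin[1] + d[1], origin[2] + d[2])
            # Segment finished: advance the origin to its end point.
            d = primitive(duration)
            origin = (origin[0] + d[0], origin[1] + d[1], origin[2] + d[2])
            t -= duration
        return origin  # past the last segment, the target holds still
    return position


# Example: 2 s of straight motion, a quarter-circle arc, then an oscillation.
trajectory = compose(
    (0.0, 0.0, 0.0),
    [
        (straight_line((1.0, 0.0, 0.0)), 2.0),
        (circular_arc(1.0, math.pi / 2), 1.0),
        (simple_harmonic(0.5, math.pi, axis=2), 1.0),
    ],
)
```

Calling `trajectory(t)` returns the target position at time `t`. Other composition rules, such as superimposing an oscillation on a straight-line drift, would fit the abstract's description equally well.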
Supplementary Material: zip
Primary Area: datasets and benchmarks
Submission Number: 7141