One-Shot Real-World Demonstration Synthesis for Scalable Bimanual Manipulation

ICLR 2026 Conference Submission15256 Authors

19 Sept 2025 (modified: 08 Oct 2025)
Keywords: bimanual robotic manipulation, one-shot learning, demonstration synthesis, imitation learning
TL;DR: BiDemoSyn synthesizes diverse, real-world bimanual demonstrations from a single example using vision-guided adaptation and hierarchical optimization, removing the need for simulation or manual data collection in scalable imitation learning.
Abstract: Learning dexterous bimanual manipulation policies critically depends on large-scale, high-quality demonstrations, yet current paradigms face inherent trade-offs: teleoperation provides physically grounded data but is prohibitively labor-intensive, while simulation-based synthesis scales efficiently but suffers from sim-to-real gaps. We present BiDemoSyn, a framework that synthesizes contact-rich, physically feasible bimanual demonstrations from a single real-world example. The key idea is to decompose tasks into invariant coordination blocks and variable, object-dependent adjustments, then adapt them through vision-guided alignment and lightweight trajectory optimization. This enables the generation of thousands of diverse and feasible demonstrations within several hours, without repeated teleoperation or reliance on imperfect simulation. Across six dual-arm tasks, we show that policies trained on BiDemoSyn data generalize robustly to novel object poses and shapes, significantly outperforming recent baselines. By bridging the gap between efficiency and real-world fidelity, BiDemoSyn provides a scalable path toward practical imitation learning for complex bimanual manipulation without compromising physical grounding.
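The decompose-then-adapt idea from the abstract can be sketched minimally as follows. This is an illustrative toy in 2D, not the authors' implementation: the function names (`retarget`, `synthesize`), the segment labels, and the use of a planar rigid transform for the object-dependent parts are all assumptions for exposition.

```python
# Hypothetical sketch of the synthesis idea: a single demonstration is split
# into invariant coordination blocks (kept verbatim) and variable,
# object-dependent segments (rigidly re-targeted to a new object pose that
# vision would estimate). All names here are illustrative, not BiDemoSyn's API.
import math
from typing import List, Tuple

Waypoint = Tuple[float, float]  # planar (x, y) waypoint, for illustration only

def retarget(points: List[Waypoint], dx: float, dy: float, theta: float) -> List[Waypoint]:
    """Apply a planar rigid transform (rotation theta, then translation) to waypoints."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]

def synthesize(demo: List[Tuple[str, List[Waypoint]]],
               new_pose: Tuple[float, float, float]) -> List[Waypoint]:
    """Keep 'invariant' blocks as-is; adapt 'variable' blocks to the new object pose."""
    dx, dy, theta = new_pose
    out: List[Waypoint] = []
    for kind, segment in demo:
        out.extend(segment if kind == "invariant" else retarget(segment, dx, dy, theta))
    return out

# One seed demo: an invariant coordination block plus an object-dependent segment.
demo = [("invariant", [(0.0, 0.0)]), ("variable", [(1.0, 0.0)])]
new_traj = synthesize(demo, (0.5, 0.0, math.pi / 2))
# invariant waypoint unchanged; variable waypoint (1, 0) rotated 90° and shifted to (0.5, 1.0)
```

In the actual framework, the "variable" adjustment would be driven by vision-estimated object poses and refined by lightweight trajectory optimization rather than a bare rigid transform; the sketch only shows the structural split that makes one demonstration reusable across object configurations.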
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 15256