Hierarchical Scene Graphs and Contact-Aware Behavior Trees for Learning and Executing Bimanual Manipulation

IEEE ICRA 2026 Workshop CR2, Submission 20

Published: 06 May 2026, Last Modified: 16 May 2026
Venue: CR2@ICRA2026 Poster
License: CC BY 4.0
Keywords: contact-rich manipulation, scene graphs, behavior trees, bimanual manipulation, learning from demonstration
TL;DR: A unified framework for contact-rich bimanual manipulation that combines scene graphs, contact reasoning, human demonstrations, and behavior trees for generalizable, reactive execution in shelf restocking and luggage handling.
Abstract: Scene graphs have become a standard representation for semantic scene understanding in robotics, yet they typically encode only static spatial and categorical relations. We argue that for contact-rich bimanual manipulation, contact and affordance relations must be elevated to first-class entities within the scene graph rather than being handled exclusively at the controller level. We present a concrete architectural proposal in which a hierarchical scene graph is augmented with explicit contact states, affordance edges, and temporal interaction dynamics, and show how this enriched representation naturally interfaces with behavior trees for reactive bimanual execution. Using supermarket shelf replenishment as a running example, we illustrate how the proposed representation captures the structure of bimanual sequencing: from bilateral approach and asymmetric first contact, through grasp establishment and shared transport, to supported placement and release. We describe our ongoing pipeline for extracting these contact-augmented graph sequences from human demonstrations and discuss the design choices, open challenges, and expected benefits of treating contact as semantic state.
Submission Number: 20
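
To make the proposal in the abstract concrete, the following is a minimal sketch of how a hierarchical scene graph with first-class contact edges might drive a behavior tree. It is not the authors' implementation: every name here (`SceneGraph`, `Contact`, `ensure`, `restock`, the node labels) is a hypothetical choice made for illustration, the phase sequence is simplified from the abstract's shelf-replenishment example, and the actions complete instantly, whereas real controller actions would return `RUNNING` over many ticks.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class Contact(Enum):
    NONE = 0
    GRASP = 1
    SUPPORT = 2


class Status(Enum):
    SUCCESS = 0
    FAILURE = 1
    RUNNING = 2


@dataclass
class SceneGraph:
    # Hierarchy edges: node name -> parent name (None for the root).
    parent: Dict[str, Optional[str]] = field(default_factory=dict)
    # Contact edges as semantic state: (source, target) -> contact type.
    contacts: Dict[Tuple[str, str], Contact] = field(default_factory=dict)
    # Record of every contact change made by an action (for inspection).
    log: List[str] = field(default_factory=list)

    def add(self, name: str, parent: Optional[str] = None) -> None:
        self.parent[name] = parent

    def contact(self, src: str, dst: str) -> Contact:
        return self.contacts.get((src, dst), Contact.NONE)


Node = Callable[[SceneGraph], Status]


def sequence(*children: Node) -> Node:
    # Ticks children in order; stops at the first one that does not succeed.
    def tick(g: SceneGraph) -> Status:
        for child in children:
            status = child(g)
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS
    return tick


def fallback(*children: Node) -> Node:
    # Ticks children in order; stops at the first one that does not fail.
    def tick(g: SceneGraph) -> Status:
        for child in children:
            status = child(g)
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE
    return tick


def has_contact(src: str, dst: str, state: Contact) -> Node:
    # Condition node: reads the contact edge directly from the scene graph.
    return lambda g: (Status.SUCCESS if g.contact(src, dst) is state
                      else Status.FAILURE)


def establish(src: str, dst: str, state: Contact) -> Node:
    # Action node: stand-in for a controller; it only updates the graph.
    def tick(g: SceneGraph) -> Status:
        g.contacts[(src, dst)] = state
        g.log.append(f"{src}->{dst}:{state.name}")
        return Status.SUCCESS
    return tick


def ensure(src: str, dst: str, state: Contact) -> Node:
    # Act only if the desired contact state does not already hold.
    return fallback(has_contact(src, dst, state), establish(src, dst, state))


def build_graph() -> SceneGraph:
    g = SceneGraph()
    g.add("store")
    g.add("shelf", parent="store")
    g.add("box", parent="store")
    g.add("robot", parent="store")
    g.add("left_hand", parent="robot")
    g.add("right_hand", parent="robot")
    return g


# Simplified phases: first contact, second grasp, supported placement, release.
restock = sequence(
    ensure("left_hand", "box", Contact.GRASP),
    ensure("right_hand", "box", Contact.GRASP),
    ensure("shelf", "box", Contact.SUPPORT),
    ensure("left_hand", "box", Contact.NONE),
    ensure("right_hand", "box", Contact.NONE),
)
```

Calling `restock(build_graph())` ticks the tree once against a fresh graph. Because each step checks the contact edge before acting, a graph in which `left_hand` already grasps `box` skips the first action, which is one way the reactivity described in the abstract could follow from treating contact as graph state.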