BiAssemble: Learning Collaborative Affordance for Bimanual Geometric Assembly

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
TL;DR: We introduce BiAssemble, a framework that learns bimanual collaborative affordances for long-horizon geometric shape assembly tasks.
Abstract: Shape assembly, the process of combining parts into a complete whole, is a crucial skill for robots with broad real-world applications. Among the various assembly tasks, geometric assembly, where broken parts are reassembled into their original form (e.g., reconstructing a shattered bowl), is particularly challenging. It requires the robot to recognize geometric cues for grasping, assembly, and subsequent bimanual collaborative manipulation on varied fragments. In this paper, we exploit the geometric generalization of point-level affordance, learning affordances that are aware of bimanual collaboration in geometric assembly with long-horizon action sequences. To address the evaluation ambiguity caused by the geometric diversity of broken parts, we introduce a real-world benchmark featuring geometric variety and global reproducibility. Extensive experiments demonstrate the superiority of our approach over both previous affordance-based and imitation-based methods.
Lay Summary: Geometric shape assembly is a fundamental problem across multiple domains, including archaeology, where fragmented artifacts must be reconstructed to support cultural heritage restoration, and robotics and manufacturing, where assembling broken or modular components is essential for tasks such as object repair, packaging, and furniture construction. For robots, assembling broken parts into their original shapes is especially challenging when the parts vary in shape and size. It requires strong visual understanding, precise manipulation, and coordinated use of both arms. In this work, we teach two-arm (bimanual) robots how to reassemble broken objects from different categories. The robot first learns to understand the geometry of each broken part, avoiding picking them up in ways that would make assembly difficult—such as grabbing them by their broken edges. Then, using both arms, the robot aligns two pieces at their broken edges, and gradually brings them together to complete the object. Our method builds on a concept called visual affordance, which helps the robot understand where and how to grasp based on the geometry of the parts. We extend this idea of affordance learning to support bimanual coordination over long-horizon action sequences to complete the assembly task. To evaluate our approach, we create a new benchmark with real-world examples of broken objects with diverse shapes. Our experiments show that the robot can successfully assemble different types of objects, both in simulation and in real-world settings.
Primary Area: Applications->Robotics
Keywords: Bimanual Manipulation, Geometric Shape Assembly
Submission Number: 9866