EgoDemoGen: Novel Egocentric Demonstration Generation Enables Viewpoint-Robust Manipulation

04 Sept 2025 (modified: 14 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: imitation learning; novel demonstration generation; video generation; robot learning
Abstract: Imitation learning-based policies perform well in robotic manipulation, but they often degrade under \emph{egocentric viewpoint shifts} when trained from a single egocentric viewpoint. To address this issue, we present \textbf{EgoDemoGen}, a framework that generates \emph{paired} novel egocentric demonstrations by retargeting actions into the novel egocentric frame and synthesizing the corresponding egocentric observation videos with the proposed generative video repair model \textbf{EgoViewTransfer}, which is conditioned on a novel-viewpoint reprojected scene video and a robot-only video rendered from the retargeted joint actions. EgoViewTransfer is finetuned from a pretrained video generation model using a self-supervised double reprojection strategy. We evaluate EgoDemoGen both in simulation (RoboTwin2.0) and on a real-world robot. After training with a mixture of EgoDemoGen-generated novel egocentric demonstrations and original standard egocentric demonstrations, policy success rate improves \textbf{absolutely} by \textbf{+17.0\%} for the standard egocentric viewpoint and by \textbf{+17.7\%} for novel egocentric viewpoints in simulation. On the real-world robot, the \textbf{absolute} improvements are \textbf{+18.3\%} and \textbf{+25.8\%}, respectively. Moreover, performance continues to improve as the proportion of EgoDemoGen-generated demonstrations increases, albeit with diminishing returns. These results demonstrate that EgoDemoGen provides a practical route to egocentric viewpoint-robust robotic manipulation.
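Below is a minimal sketch of how a novel-viewpoint reprojected scene video of the kind that conditions EgoViewTransfer could be produced from egocentric RGB-D frames; the same single-frame warp, applied out to a novel viewpoint and back, is presumably the geometric step behind the self-supervised double reprojection strategy. All names (reproject_rgbd, T_new_from_cam), the availability of per-pixel depth, and the shared-intrinsics assumption are illustrative assumptions, not details taken from the submission.

```python
import numpy as np

def reproject_rgbd(rgb, depth, K, T_new_from_cam):
    """Warp one egocentric RGB-D frame into a novel camera pose (sketch).

    rgb:             (H, W, 3) color image
    depth:           (H, W) depth in meters (0 = invalid)
    K:               (3, 3) camera intrinsics, assumed shared by both views
    T_new_from_cam:  (4, 4) rigid transform from the original camera frame
                     to the novel egocentric camera frame
    Returns a warped image with zero-valued holes where no source pixel
    lands; a generative repair model would be trained to fill these regions.
    """
    H, W = depth.shape
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    valid = depth > 0

    # Unproject valid pixels to 3D points in the original camera frame.
    z = depth[valid]
    x = (us[valid] - K[0, 2]) * z / K[0, 0]
    y = (vs[valid] - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z, np.ones_like(z)])            # (4, N)

    # Move the points into the novel viewpoint and project them.
    pts_new = (T_new_from_cam @ pts)[:3]                   # (3, N)
    in_front = pts_new[2] > 1e-6
    uvw = K @ pts_new[:, in_front]
    u_new = np.round(uvw[0] / uvw[2]).astype(int)
    v_new = np.round(uvw[1] / uvw[2]).astype(int)
    z_new = uvw[2]
    colors = rgb[valid][in_front]

    # Z-buffered splat: nearer points overwrite farther ones.
    out = np.zeros_like(rgb)
    zbuf = np.full((H, W), np.inf)
    inside = (u_new >= 0) & (u_new < W) & (v_new >= 0) & (v_new < H)
    for u, v, z_i, c in zip(u_new[inside], v_new[inside],
                            z_new[inside], colors[inside]):
        if z_i < zbuf[v, u]:
            zbuf[v, u] = z_i
            out[v, u] = c
    return out
```

Applying such a warp frame by frame yields a scene video with occlusion holes and missing out-of-view regions, which is exactly the kind of degraded conditioning input a generative repair model like EgoViewTransfer would then complete.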
Primary Area: applications to robotics, autonomy, planning
Submission Number: 2009