Agent-to-Sim: Learning Interactive Behavior Model from Casual Longitudinal Videos

Published: 22 Jan 2025, Last Modified: 11 Feb 2025, ICLR 2025 Poster, CC BY 4.0
Keywords: dynamic 3d reconstruction; multi-video registration; motion generation
TL;DR: Given monocular videos collected across a long time horizon (e.g., 1 month), we build interactive behavior models of an agent grounded in a 3D environment.
Abstract: We present Agent-to-Sim (ATS), a framework for learning interactive behavior models of 3D agents in a 3D environment from casually captured videos. Unlike prior work that relies on marker-based tracking and multiview cameras, ATS learns natural behaviors of animal and human agents non-invasively, directly from monocular video collections. Modeling the 3D behavior of an agent requires persistent 3D tracking (e.g., knowing which point corresponds to which) over a long time period. To obtain such data, we develop a coarse-to-fine registration method that tracks the agent and the camera over time through a canonical 3D space, resulting in a complete and persistent 4D spacetime representation. We then train a generative model of agent behaviors using paired perception and motion data of an agent queried from the 4D reconstruction. ATS enables real-to-sim transfer of agents in their familiar environments given longitudinal video recordings (e.g., over a month). We demonstrate results on pets (e.g., cat, dog, bunny) and humans, given monocular RGBD video collections captured by a smartphone.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 4756