Dynamic Scene Generation for Embodied Navigation Benchmark

Published: 26 Jun 2024, Last Modified: 09 Jul 2024 · DGR@RSS 2024 Poster · CC BY 4.0
Keywords: Embodied Environment, Dynamic Scene Generation, Human Simulation
TL;DR: We propose a framework that uses LLMs to simulate human behaviors for the scalable generation of dynamic scenes, which may serve as a benchmark for embodied AI.
Abstract: Although embodied agents have been widely studied across diverse tasks and abundant benchmarks, research on dynamic scenarios has not been sufficiently supported by large-scale dynamic scenes. Many existing works target probabilistic environments in which everyday objects may be moved by human activities; however, the scale of these datasets is usually limited by the cost of human annotation or manual configuration. Toward scalable generation of such dynamic scenes, we introduce a framework that simulates human activities and the corresponding object dynamics with Large Language Models (LLMs) and applies the simulated human residents to embodied scenes. A user study comparing our generated scene dynamics with other approaches validates that our framework produces believable and diversified data of quality comparable to human annotations. We further conduct object-goal navigation experiments on the dynamic scenes under various problem settings with representative baselines. The results verify the potential of the generated scenes to serve as navigation benchmarks, while suggesting that dynamic scenes introduce new challenges to embodied navigation. Our work contributes an infrastructure that may facilitate future studies on embodied AI in dynamic environments. A visualization and online demonstration of our dynamic scene generation framework is available at https://huggingface.co/spaces/JW0003/DynamicSceneGeneration
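To make the idea concrete, the core loop described in the abstract — an LLM proposes a human activity, and the resulting object relocations update the scene state — can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`propose_activity`, `step_scene`), the scene schema, and the canned placeholder response standing in for a real LLM call are all assumptions.

```python
import json

def propose_activity(resident, scene, llm=None):
    """Placeholder for an LLM call: given a resident profile and the
    current scene state, return the chosen activity and the object
    relocations it implies. A canned response stands in for model output."""
    if llm is None:
        # Hypothetical model reply; a real system would prompt the LLM
        # with the scene state and parse its structured (e.g. JSON) answer.
        return {"activity": "make coffee",
                "moves": [{"object": "mug", "to": "kitchen_counter"}]}
    reply = llm(json.dumps({"resident": resident, "scene": scene}))
    return json.loads(reply)

def step_scene(scene, resident):
    """Advance the dynamic scene by one simulated human activity,
    applying each proposed object move to the scene's location map."""
    plan = propose_activity(resident, scene)
    for move in plan["moves"]:
        scene["locations"][move["object"]] = move["to"]
    return plan["activity"]

if __name__ == "__main__":
    scene = {"locations": {"mug": "dining_table", "book": "sofa"}}
    activity = step_scene(scene, {"name": "resident_1"})
    print(activity, scene["locations"]["mug"])
```

Iterating this step over many simulated residents and time steps is what would yield the large-scale, probabilistic object dynamics that the navigation experiments evaluate against.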
Submission Number: 11