Keywords: Virtual worlds; Embodied AI; Embodied Tracking and Navigation; Visual RL
TL;DR: A collection of photo-realistic 3D environments for benchmarking embodied AI agents.
Abstract: Embodied artificial intelligence agents should be capable of sensing, reasoning, planning, and acting in complex open worlds, which are unstructured, highly dynamic, and uncertain. To deploy agents in the real world, the simulated worlds used to train and evaluate them must be realistic. This paper introduces UnrealZoo, a rich collection of photo-realistic 3D environments, built on Unreal Engine, that mimic the complexity and variability of the real world. For embodied AI, we provide a diverse array of playable entities in the environments and a suite of tools, based on UnrealCV, for data collection, reinforcement learning, and evaluation. In the experiments, we benchmark agents on visual navigation and tracking, two fundamental tasks for embodied vision agents, in complex open worlds. The results provide valuable insights into the benefits of enriching the diversity of the training environments and into the challenges that open worlds pose to current embodied vision agents, e.g., latency in closed-loop control when interacting with dynamic objects and reasoning about the spatial structure of complex scenes.
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7171