everyone
since 04 Oct 2024">EveryoneRevisionsBibTeXCC BY 4.0
Building autonomous agents that can replicate human behavior in the realistic 3D world is a key step toward artificial general intelligence. This requires agents to be holistic goal achievers and to naturally adapt to environmental dynamics. In this work, we introduce ACTOR, an agent capable of performing high-level, long-horizon, abstract goals in 3D households, guided by its internal value similar to those of humans. ACTOR operates in a perceive-plan-act cycle, extending the ungrounded, scene-agnostic LLM controller with deliberate goal decomposition and decision-making through actively searching the behavior space, generating activity choices based on a hierarchical prior, and evaluating these choices using customizable value functions to determine the subsequent steps. Furthermore, we introduce BehaviorHub, a large-scale human behavior simulation dataset in scene-aware, complicated tasks. Considering the unaffordable acquisition of human-authored 3D human behavior data, we construct BehaviorHub by exploring the commonsense knowledge of LLMs learned from large corpora, and automatically aligning motion resources with 3D scene for knowledgeable generation. Extensive experiments on our established benchmark demonstrate that the proposed architecture leads to effective behavior planning and simulation. BehaviorHub also proves beneficial for downstream task development. Our code and dataset will be publicly released.