ChatAni: Language-Driven Multi-Actor Animation Generation in Street Scenes

TMLR Paper 7233 Authors

29 Jan 2026 (modified: 06 Feb 2026) · Under review for TMLR · CC BY 4.0
Abstract: Generating interactive and realistic traffic participant animations from instructions is essential for autonomous driving simulation. Existing methods, however, fail to comprehensively address the diverse participants and their dynamic interactions in street scenes. In this paper, we present ChatAni, the first system capable of generating interactive, realistic, and controllable multi-actor animations from language instructions. To produce fine-grained, realistic animations, ChatAni introduces two novel animators: PedAnimator, a unified multi-task animator that generates interaction-aware pedestrian animations under varying task plans, and VehAnimator, a kinematics-based policy that generates physically plausible vehicle animations. For precise control through complex language, ChatAni employs a multi-LLM-agent role-playing approach, using natural language to plan the trajectories and behaviors of different participants. Extensive experiments demonstrate that ChatAni can generate realistic street scenes with interacting vehicles and pedestrians, benefiting downstream tasks such as prediction and understanding. All related code, data, and checkpoints will be open-sourced.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Chen_Sun1
Submission Number: 7233