Holistic Advances through Large-Scale Embodied Dialog Augmentation for Navigation

18 Sept 2025 (modified: 14 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: VLN, embodied dialog
Abstract: For embodied agents capable of physical interaction, dialog capability is crucial to ensure both safety and effectiveness. While DialNav provides a framework for holistic evaluation of the dialog–execution loop in photorealistic indoor navigation, performance under this framework remains limited. In this work, we introduce holistic advances spanning data and training. First, we develop a large-scale dialog generation pipeline to enhance coverage and diversity. Second, we propose task-aligned training for the Navigator to better reflect the dynamic dialog–navigation loop. Finally, we address the localization bottleneck with a stronger graph-aware transformer model. Together, these advances more than double success rates over prior baselines, achieving a success rate (SR) of 58.24% on Val Seen and 29.05% on Val Unseen, and establishing a new state of the art in dialog-driven embodied navigation.
Primary Area: applications to robotics, autonomy, planning
Submission Number: 9986