Keywords: VLN, Cognitive mapping
TL;DR: We use brain-inspired mechanisms to construct cognitive maps for path planning of unmanned aerial vehicles (UAVs).
Abstract: Navigating large-scale environments remains a major challenge for autonomous agents. Traditional methods often rely on detailed metric maps, whereas biological systems efficiently navigate using sparse, cognitive topological maps that support high-level reasoning. We present Topo-AeroVLN, a brain-inspired framework enabling unmanned aerial vehicles (UAVs) to perform vision-and-language navigation from a top-down perspective. Our method incrementally constructs a multi-level topological map by abstracting aerial observations into road-bounded regions and internal semantic objects. A dynamic graph update mechanism, combining multimodal embedding similarity with spatial containment, ensures efficient and scalable map construction. Multimodal Large Language Models (MLLMs) align natural language instructions with map vertices, supporting robust language-driven topological planning. Experiments demonstrate strong spatial coverage and navigation performance in complex urban environments. Topo-AeroVLN provides a generalizable, interpretable framework for UAV navigation that adapts to unseen environments without prior maps or extensive retraining.
Primary Area: applications to robotics, autonomy, planning
Submission Number: 4254
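Illustrative sketch (not from the submission): the abstract describes a dynamic graph update that combines multimodal embedding similarity with spatial containment to grow a multi-level map of road-bounded regions and the objects inside them. The Python below is one minimal way such an update could look, under stated assumptions. Every name in it (`TopoMap`, `Vertex`, `add_region`, `add_object`, `sim_threshold`), the axis-aligned bounding boxes, the cosine-similarity test, and the merge rule are hypothetical choices made for illustration; the authors' actual mechanism may differ.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vid: int
    level: str        # "region" (road-bounded area) or "object"
    embedding: list   # stand-in for a multimodal embedding vector
    box: tuple        # assumed axis-aligned box: (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def contains(outer, inner):
    """True if box `inner` lies entirely within box `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


class TopoMap:
    def __init__(self, sim_threshold=0.9):
        self.vertices = {}
        self.sim_threshold = sim_threshold  # hypothetical merge threshold

    def _add(self, level, embedding, box):
        vid = len(self.vertices)
        self.vertices[vid] = Vertex(vid, level, embedding, box)
        return vid

    def add_region(self, embedding, box):
        return self._add("region", embedding, box)

    def add_object(self, embedding, box):
        # Spatial containment: find a region whose box encloses the object.
        parent = next((v for v in self.vertices.values()
                       if v.level == "region" and contains(v.box, box)), None)
        # Embedding similarity: reuse a sibling vertex that looks the same,
        # so repeated observations do not create duplicate vertices.
        if parent is not None:
            for cid in parent.children:
                if cosine(self.vertices[cid].embedding, embedding) >= self.sim_threshold:
                    return cid
        vid = self._add("object", embedding, box)
        if parent is not None:
            parent.children.append(vid)
        return vid
```

In this sketch, an object observed twice inside the same region with near-identical embeddings maps to one vertex, while an object outside every known region becomes an unattached vertex until a containing region is added.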