Abstract: Highlights•Introduce a framework for vision-language navigation purely with 3D semantic maps.•Compare settings in VLN tasks, highlighting the role of mapping in the task.•Analyze the pros and cons of our framework to inspire future research directions.
Loading