Keywords: Visual Navigation, Dynamic Neural Network
Abstract: Visual navigation is essential for robotics and embodied AI. However, existing foundation models, particularly those with transformer decoders, suffer from high computational overhead and lack interpretability, limiting their deployment on edge devices. To address this, we propose DynaNav, a Dynamic Visual Navigation framework that adapts feature and layer selection based on scene complexity. It employs a trainable hard feature selector for sparse operations, enhancing efficiency and interpretability. Additionally, we integrate feature selection into an early-exit mechanism, with Bayesian Optimization determining optimal exit thresholds to reduce computational cost. Extensive experiments on real-world datasets and in simulated environments demonstrate the effectiveness of DynaNav. Compared to ViNT, DynaNav achieves a $2.6\times$ reduction in FLOPs, 42.3% lower inference time, and 32.8% lower memory usage while improving navigation performance across four public datasets.
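The early-exit mechanism mentioned in the abstract can be sketched minimally. Everything below is an illustrative assumption, not the paper's actual method: the layers are toy random linear maps, the confidence measure (max softmax probability) and the per-layer thresholds stand in for whatever DynaNav's Bayesian Optimization would choose.

```python
import math, random

def softmax(logits):
    # Numerically stable softmax over a plain list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def early_exit_forward(layers, x, thresholds):
    """Run layers in order; after each layer, stop as soon as the
    max softmax probability clears that layer's threshold.
    Returns (logits, index of the layer that produced them)."""
    for i, (layer, tau) in enumerate(zip(layers, thresholds)):
        x = layer(x)
        if max(softmax(x)) >= tau:  # confident enough -> exit early
            return x, i
    return x, len(layers) - 1

random.seed(0)

# Toy "layers": fixed random linear maps on a 4-dim feature vector.
def make_layer():
    W = [[random.gauss(0, 1) for _ in range(4)] for _ in range(4)]
    return lambda x: [sum(w * v for w, v in zip(row, x)) for row in W]

layers = [make_layer() for _ in range(3)]
x0 = [random.gauss(0, 1) for _ in range(4)]

# A final threshold of 0.0 guarantees the last layer always exits.
logits, exit_layer = early_exit_forward(layers, x0, [0.9, 0.7, 0.0])
```

In a real system, easy scenes would clear an early threshold and skip the remaining layers, which is the source of the FLOP and latency savings the abstract reports.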
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 13926