$DA^2$-VPR: Dynamic Architecture for Domain-Aware Visual Place Recognition

17 Sept 2025 (modified: 14 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Visual Place Recognition, Parameter-Efficient Fine-Tuning, Foundation Model, Feature Aggregation
Abstract: Visual Place Recognition (VPR) systems struggle with training-to-test domain shifts caused by environmental changes such as lighting, weather, and seasonal variations. Existing methods rely on input-invariant strategies with fixed parameters, which restrict their ability to cope with diverse test conditions. We propose Dynamic Architecture for Domain-Aware Visual Place Recognition ($DA^2$-VPR), a dynamic feature modulation framework that adapts representations according to input scene characteristics. By dynamically modulating features across spatial and channel dimensions using foundation model features as conditioning signals, our method effectively narrows the training-to-test gap. Our framework consists of: (1) a dynamic adapter that adjusts representations to scene conditions, (2) a transformer aggregator with adaptive query generation from input features, and (3) a domain-variance augmentation scheme with texture and appearance modifications. Experiments on challenging VPR benchmarks with significant domain shifts show that $DA^2$-VPR consistently outperforms input-invariant baselines, demonstrating superior generalization and establishing new state-of-the-art results.
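The abstract's core mechanism — modulating backbone features across channel and spatial dimensions, conditioned on a foundation-model feature vector — can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the authors' implementation: the FiLM-style channel scale/shift, the sigmoid spatial gate, and all names (`dynamic_modulation`, `Wg`, `Ws`) are hypothetical stand-ins for whatever conditioning network the paper actually uses.

```python
import numpy as np

def dynamic_modulation(feat, cond, Wg, Ws):
    """Hypothetical sketch of input-conditioned feature modulation.

    feat: (C, H, W) backbone feature map for one image.
    cond: (D,) conditioning vector (e.g. a pooled foundation-model feature).
    Wg:   (2C, D) projection producing channel-wise scale and shift.
    Ws:   (H*W, D) projection producing spatial gate logits.
    """
    C, H, W = feat.shape
    # Channel-wise modulation: FiLM-style (1 + gamma) * feat + beta.
    gamma_beta = Wg @ cond
    gamma, beta = gamma_beta[:C], gamma_beta[C:]
    out = feat * (1.0 + gamma)[:, None, None] + beta[:, None, None]
    # Spatial modulation: a sigmoid gate over locations, also from cond.
    gate = 1.0 / (1.0 + np.exp(-(Ws @ cond).reshape(H, W)))
    return out * gate[None, :, :]

# Toy usage with random weights; a real adapter would learn Wg and Ws.
rng = np.random.default_rng(0)
C, H, W, D = 4, 3, 3, 8
feat = rng.standard_normal((C, H, W))
cond = rng.standard_normal(D)
Wg = 0.1 * rng.standard_normal((2 * C, D))
Ws = 0.1 * rng.standard_normal((H * W, D))
out = dynamic_modulation(feat, cond, Wg, Ws)
print(out.shape)  # (4, 3, 3)
```

Because `cond` differs per input image, the effective scale, shift, and spatial gate differ per image as well — the input-dependent behavior that the abstract contrasts with fixed-parameter, input-invariant baselines.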
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 8372