Abstract: Hyperspectral videos contain richer spectral and physical features than RGB videos and thus have greater potential for use in object tracking. The mainstream approach to hyperspectral object tracking integrates multiple RGB-based video tracking models. Although such ensembles can effectively exploit spectral information and improve tracker performance, their high computational complexity makes it difficult to meet the real-time requirements of video object tracking. To bridge this gap, we propose HotMoE, a new hyperspectral object tracking framework based on Mixture-of-Experts (MoE). HotMoE follows a divide-and-conquer strategy in which only a subset of expert models is computed for each input, reducing computational complexity while maintaining performance. We first design a splitter that groups the spectral bands into multiple false-color images based on spectral correlations. We then design a hyperspectral MoE router that adaptively learns to aggregate spectral image feature information and route it to suitable experts. Because different experts handle different scenarios, HotMoE combines their capabilities to obtain better overall performance. Compared with previous state-of-the-art hyperspectral object tracking networks, our model significantly reduces inference time while remaining accurate, reaching a processing speed of 43.7 FPS and an AUC of 0.704 on the HOT2022 dataset.
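The idea of computing only a subset of experts per input can be illustrated with a minimal sketch. It is not the HotMoE implementation: the abstract does not specify the gating rule, so top-k softmax gating is assumed here, and the names `route_top_k`, `sparse_moe`, the toy experts, and the hard-coded gate scores are all hypothetical stand-ins for what a learned router and real tracking experts would provide.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Assumption: top-k gating with weights renormalized over the selected
    experts; the paper's router may use a different rule.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def sparse_moe(x, experts, gate_scores, k=2):
    """Weighted sum of the selected experts' outputs.

    Experts that are not selected are never called, which is where the
    computational saving comes from.
    """
    selected = route_top_k(gate_scores, k)
    out = [0.0] * len(x)
    for idx, weight in selected:
        y = experts[idx](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out, [idx for idx, _ in selected]


# Toy experts standing in for per-scenario tracking branches (hypothetical).
experts = [
    lambda x: [v * 1.0 for v in x],
    lambda x: [v * 2.0 for v in x],
    lambda x: [v + 1.0 for v in x],
    lambda x: [-v for v in x],
]

# In a real router these scores would be predicted from the input features.
output, used = sparse_moe([1.0, 2.0], experts, gate_scores=[0.1, 2.0, 1.5, -1.0], k=2)
print(used)
```

With four experts and k = 2, only two expert functions run for this input; raising k trades speed for a larger ensemble.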
External IDs: dblp:journals/tmm/SunTLHLSWS25