A lightweight robust RGB-T object tracker based on Jitter Factor and associated Kalman filter

Published: 2025 · Last Modified: 08 Nov 2025 · Inf. Fusion 2025 · CC BY-SA 4.0
Abstract: Visual object tracking has contributed significantly to many practical applications, but it remains challenging when the camera moves or shakes, or when the target is occluded. Various deep-learning (DL) solutions have been introduced to address these challenging factors. However, DL-based methods can hardly be deployed on edge computing platforms due to their limited computational resources. In this study, we propose a lightweight, robust cross-modal fusion RGB-T object tracker for edge computing platforms based on a Jitter Factor and an associated Kalman filter. In the proposed tracker, visible and infrared features of the target are extracted and fused with a cross-modal fusion strategy driven by modal reliability. Meanwhile, the newly proposed Jitter Factor, derived from image morphology, is used to detect camera motion. Once camera motion is detected, the target position is corrected via global image registration and the associated Kalman filter. Experimental results on the RGBT234 and GTOT datasets indicate that the proposed lightweight tracker outperforms other non-DL-based tracking methods. Under camera motion, it achieves performance competitive with DL-based trackers while running faster (25 FPS using only a CPU).
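The abstract describes the correction pipeline only at a high level. The sketch below is a rough illustration of the idea, not the paper's method: the `jitter_factor` formula (frame-difference energy after morphological opening), the `ConstantVelocityKF` noise settings, the `jitter_threshold` constant, and the use of `cv2.phaseCorrelate` for global registration are all assumptions standing in for details the abstract does not give.

```python
# Hypothetical sketch of jitter-triggered camera-motion compensation.
# The real Jitter Factor and registration method are defined in the paper,
# not here; this only mirrors the pipeline the abstract outlines.
import cv2
import numpy as np

KERNEL = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

def jitter_factor(prev_gray: np.ndarray, curr_gray: np.ndarray) -> float:
    """Stand-in for the paper's morphology-based Jitter Factor (assumption):
    mean absolute frame difference surviving a morphological opening.
    A large value suggests global (camera) motion rather than local motion."""
    diff = cv2.absdiff(prev_gray, curr_gray)
    opened = cv2.morphologyEx(diff, cv2.MORPH_OPEN, KERNEL)
    return float(opened.mean())

class ConstantVelocityKF:
    """Minimal constant-velocity Kalman filter over the state [x, y, vx, vy]."""
    def __init__(self, x, y, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x, y, 0.0, 0.0])
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)   # state transition
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)    # observe position only
        self.P = np.eye(4)          # state covariance
        self.Q = q * np.eye(4)      # process noise (illustrative value)
        self.R = r * np.eye(2)      # measurement noise (illustrative value)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        y = np.asarray(z, dtype=float) - self.H @ self.x   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)           # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

def correct_for_camera_motion(prev_gray, curr_gray, measured_xy, kf,
                              jitter_threshold=4.0):
    """If the jitter proxy flags camera motion, estimate the global shift by
    phase correlation (a simple form of global registration) and compensate
    the raw measurement before the Kalman update. The threshold is an
    illustrative constant, not a value from the paper."""
    kf.predict()
    if jitter_factor(prev_gray, curr_gray) > jitter_threshold:
        # phaseCorrelate reports the shift of curr relative to prev; subtracting
        # it maps the measurement back into the previous frame's coordinates.
        (dx, dy), _ = cv2.phaseCorrelate(prev_gray.astype(np.float32),
                                         curr_gray.astype(np.float32))
        measured_xy = (measured_xy[0] - dx, measured_xy[1] - dy)
    return kf.update(measured_xy)
```

In the actual tracker, the measurement fed to the filter would come from the cross-modal fusion of visible and infrared responses; `measured_xy` above is merely a placeholder for that output.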