Learning multi-level graph attentional representation for thermal infrared object tracking

Published: 2025, Last Modified: 31 Oct 2025. Eng. Appl. Artif. Intell. 2025. CC BY-SA 4.0
Abstract: Thermal infrared (TIR) object tracking is a fundamental task in computer vision that is not affected by changes in lighting conditions. TIR trackers outperform visible-light trackers in extreme environments such as nighttime, heavy rain, haze, and sandstorms. However, TIR object tracking also faces challenges such as occlusion, thermal crossover, motion blur, and interference from similar objects. Unlike visible-light images, TIR images lack color information and texture features. These factors make it difficult to learn fine-grained detail and shape features of the targets, and therefore hard to distinguish targets from interference effectively. To address these issues, we propose SiamMLGR, a graph-based deep learning model within the Siamese framework for stable TIR object tracking. Specifically, to extract more fine-grained features of TIR targets, we propose a multiple graph attention module (MGAM) that replaces the global matching scheme used for information transmission in the Siamese framework. This module constructs a graph structure to establish local and global connections between the target and the search area. Furthermore, to retain more of the features learned by the MGAM, we propose a spatial graph convolutional module (SGCM), which uses an explicit graph adjacency matrix to propagate information between the attention graphs. Additionally, we incorporate large-scale datasets from the visual tracking field into model training. Mixing these with TIR datasets addresses the sample imbalance present in pure TIR datasets. Extensive experimental results indicate that the proposed method achieves state-of-the-art performance.
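The two graph ideas named in the abstract can be illustrated with a toy sketch. This is not the paper's implementation of MGAM or SGCM: the abstract does not give their formulas, so the sketch assumes plain scaled dot-product attention from search nodes to target (template) nodes, followed by one row-normalised propagation step over an explicit 4-neighbour grid adjacency matrix. All function names (`cross_graph_attention`, `grid_adjacency`, `propagate`) and the toy feature values are hypothetical, and there are no learned weights.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_graph_attention(search_nodes, template_nodes):
    """Assumed form: every search node attends to every template node.

    Returns the aggregated template features per search node and the
    attention weights (one row per search node).
    """
    dim = len(template_nodes[0])
    aggregated, all_weights = [], []
    for s in search_nodes:
        weights = softmax([dot(s, t) / math.sqrt(dim) for t in template_nodes])
        aggregated.append(
            [sum(w * t[k] for w, t in zip(weights, template_nodes)) for k in range(dim)]
        )
        all_weights.append(weights)
    return aggregated, all_weights


def grid_adjacency(height, width):
    """Explicit adjacency matrix: 4-neighbour grid with self-loops (an assumption)."""
    n = height * width
    adj = [[0.0] * n for _ in range(n)]
    for r in range(height):
        for c in range(width):
            i = r * width + c
            adj[i][i] = 1.0
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    adj[i][rr * width + cc] = 1.0
    return adj


def propagate(adj, features):
    """One graph-convolution-style step: row-normalised adjacency times features."""
    dim = len(features[0])
    out = []
    for row in adj:
        degree = sum(row)
        out.append([sum(a * f[k] for a, f in zip(row, features)) / degree for k in range(dim)])
    return out


# Toy data: 2 template nodes, 4 search nodes laid out on a 2x2 grid.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0], [0.0, 0.0]]

attended, weights = cross_graph_attention(search, template)
smoothed = propagate(grid_adjacency(2, 2), attended)
```

In this sketch, a search node that resembles one template node places most of its attention weight on that node, and the propagation step then averages each node's attended features with those of its spatial neighbours. A real tracker would add learned projections and operate on convolutional feature maps.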