Optimization-Diverse Transformer Adaptation with Gated Heterogeneous Fusion for Robust Thermal Object Detection

Published: 27 Apr 2026, Last Modified: 27 Apr 2026 | MaCVi Poster | CC BY 4.0
Keywords: Optimization-Diverse Transformer Adaptation, Gated Heterogeneous Fusion, Thermal Object Detection
TL;DR: We improve thermal object detection by combining optimization-diverse transformer ensembling with gated CNN–transformer fusion for stable high-IoU localization.
Abstract: Thermal object detection is inherently challenging due to low contrast, weak boundary definition, and reduced semantic texture. While transformer-based detectors provide strong global context modeling, direct adaptation from large-scale RGB pretraining often leads to localization instability under strict IoU evaluation. In this work, we present a contrast-aware transformer adaptation framework with optimization-diverse ensembling and gated heterogeneous fusion for robust thermal detection. We adapt DEIMv2–DINOv3-x using controlled layer-wise fine-tuning and hybrid encoder scaling to stabilize transfer to the thermal domain. Multi-seed training introduces optimization diversity in transformer query alignment, and Stage-1 Weighted Boxes Fusion aggregates these diverse predictions into a stable consensus detector (DEIM3). Auxiliary CNN detectors (YOLOv7 and YOLOv13) are incorporated conservatively via Stage-2 gated fusion with asymmetric weighting and agreement-based filtering to prevent precision degradation. Experimental results on the MaCVi 2026 Thermal Object Detection Challenge demonstrate that multi-seed transformer ensembling significantly improves localization stability. Furthermore, our gated heterogeneous fusion preserves high-level transformer precision while benefiting from CNN-based local cues. Our approach achieved a top-tier ranking on the official leaderboard, outperforming most participating teams and validating the effectiveness of controlled optimization diversity and selective cross-architecture fusion in low-texture thermal environments.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 2