High performance RGB-Thermal Video Object Detection via hybrid fusion with progressive interaction and temporal-modal difference
Abstract: Highlights
• A hybrid fusion strategy network for RGB-Thermal video object detection.
• An early strategy for reducing modal disparities.
• A novel differential method for modeling multimodal and temporal information.
• The proposed PTMNet achieves SOTA performance on the VT-VOD50 dataset.