A Dynamic Multimodal Fusion Framework for Forest Fire Detection

14 Nov 2025 (modified: 01 Dec 2025) · IEEE MiTA 2026 Conference Submission · CC BY 4.0
Keywords: Forest Fire Detection, Multimodal Fusion, Feature Extraction, Channel Attention Mechanism, Spatial Attention Mechanism
TL;DR: A Dynamic Multimodal Fusion Framework for Forest Fire Detection
Abstract: Accurate fire detection is crucial for forest fire prevention and ecological protection. Existing multimodal models struggle with inter-modal data fusion and feature extraction, which degrades performance under complex backgrounds and environmental disturbances. To address these issues, this paper proposes a dynamic multimodal data fusion framework that integrates RGB images, infrared images, and environmental sensor data. First, the environmental sensor data are normalized and rendered as image representations, then spatially aligned with the visual modalities to ensure consistency and make fusion feasible. Next, a multiscale feature extraction and optimization module and a global feature modeling module are employed to capture local and global features, respectively. Channel and spatial attention mechanisms are incorporated to enhance the representation of key fire-related regions, including flames, smoke, and high-temperature areas. Finally, a fusion layer performs deep joint modeling of the multimodal features. Experimental results demonstrate that the proposed method outperforms both unimodal and existing multimodal approaches in overall performance and in adaptability across classifiers.
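The abstract does not specify the exact layer designs, so the sketch below is a minimal, hypothetical PyTorch illustration of the described pipeline, assuming (a) sensor readings are min-max normalized and tiled into constant image-like channels so they can be spatially aligned with the visual modalities, and (b) CBAM-style channel and spatial attention is applied to the concatenated modality features before joint modeling. All names here (sensors_to_image, TriModalFusion, etc.) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


def sensors_to_image(readings, lo, hi, size=(32, 32)):
    """Min-max normalize sensor readings and tile each as a constant image
    channel, so the sensor modality can be spatially aligned with RGB/IR."""
    norm = torch.clamp((torch.tensor(readings) - lo) / (hi - lo), 0.0, 1.0)
    return torch.stack([torch.full(size, float(v)) for v in norm])  # (S, H, W)


class ChannelAttention(nn.Module):
    """Reweights channels from pooled descriptors (squeeze-and-excitation style)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        # Combine average- and max-pooled channel descriptors.
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) + self.mlp(x.amax(dim=(2, 3))))
        return x * w.view(b, c, 1, 1)


class SpatialAttention(nn.Module):
    """Highlights fire-relevant regions (flames, smoke, hot spots) with a
    single-channel spatial mask computed from pooled feature maps."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))


class TriModalFusion(nn.Module):
    """Fuses spatially aligned RGB, IR, and sensor-image feature maps, then
    applies channel and spatial attention to the joint representation."""
    def __init__(self, channels=64):
        super().__init__()
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, rgb_feat, ir_feat, sensor_feat):
        x = self.fuse(torch.cat([rgb_feat, ir_feat, sensor_feat], dim=1))
        return self.sa(self.ca(x))


# Example: three aligned 64-channel feature maps at 32x32 resolution.
rgb, ir, sensor = (torch.randn(2, 64, 32, 32) for _ in range(3))
print(TriModalFusion(64)(rgb, ir, sensor).shape)  # torch.Size([2, 64, 32, 32])
```

Concatenation followed by a 1x1 convolution is only one simple fusion choice; the paper's actual fusion layer, multiscale extraction, and global modeling modules may differ.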
Submission Number: 47