Frequency-Space Interaction With Hierarchical Aggregation Network for Lightweight Smoke Image Segmentation
Abstract: Many methods adopt increasingly complex modules to improve smoke segmentation accuracy, but such models cannot reach the required processing speeds on computation-limited devices. To address this challenge, we propose a lightweight model that achieves competitive smoke segmentation performance with far fewer parameters. Specifically, we leverage Fourier transforms to enable feature interaction between the spatial and frequency domains, and design a Frequency-Spatial Interaction Block (FSIB) to accurately encode features and recover details. Additionally, to account for the morphological variations and diverse characteristics of smoke, we introduce a Group Multi-Dilated Fusion (GMDF) module between the encoder and decoder, which expands the receptive field to capture more details. Furthermore, we employ a hierarchical feature aggregation strategy to further improve the representation ability of the features. Based on these modules, we construct a Frequency-Spatial Interaction Hierarchical Aggregation Network (FSIHAN) for efficient smoke segmentation. Extensive experiments on two benchmark smoke datasets demonstrate that FSIHAN outperforms various lightweight architectures and smoke segmentation methods. On the SYN70K test set, our method achieves 79.4% mIoU with only 1.77M parameters, approximately 57x fewer than the state-of-the-art SAGINN. The base model of FSIHAN has only 0.46M parameters, about 220x fewer than SAGINN.
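As a quick consistency check on the two reduction factors quoted in the abstract, the short sketch below multiplies each FSIHAN parameter count by its stated factor. The resulting SAGINN size is only an inference from those rounded figures, not a number reported in the abstract, and the variable names are illustrative.

```python
# Parameter counts in millions, as quoted in the abstract.
fsihan_params_m = 1.77
fsihan_base_params_m = 0.46

# Approximate reduction factors relative to SAGINN, as quoted in the abstract.
reduction_full = 57
reduction_base = 220

# Implied SAGINN size from each quoted pair.
# Assumption: this is an inference from rounded figures, not a reported value.
implied_from_full = fsihan_params_m * reduction_full
implied_from_base = fsihan_base_params_m * reduction_base

print(f"implied by full model: {implied_from_full:.1f}M")
print(f"implied by base model: {implied_from_base:.1f}M")
```

Both products come out at roughly 101M parameters, so the two quoted factors are mutually consistent.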
External IDs:dblp:journals/tce/LiYW25