MMIDNet: A Multilevel Mutual Information Disentanglement Network for Cross-Domain Infrared Small Target Detection
Abstract: Infrared small target detection (IRSTD) plays a vital role in surveillance and early warning applications. However, current IRSTD models are trained on datasets with limited sample sizes and scene diversity, and they suffer from domain shift caused by discrepancies between synthetic training data and real-world test images. To address this challenge, we propose a novel multilevel mutual information disentanglement network (MMIDNet) to enhance cross-domain robustness in IRSTD. Specifically, we first design a clustering-guided data augmentation strategy that adaptively adjusts brightness and contrast based on image clustering. Second, we introduce local target-guided negative augmentation, which applies random rotations to target regions and their masks to enhance pose diversity and adaptability. Furthermore, we propose a dual-decoder architecture with an auxiliary image reconstruction branch that operates only during training. This design separates domain-specific from domain-invariant features by applying mutual information constraints at multiple levels. Finally, the model is trained on the synthetic NUDT-SIRST dataset and extensively evaluated on multiple real-world IRSTD benchmark datasets, including IRSTD-1k, SIRST, and SIRST-v2. Experimental results demonstrate that MMIDNet significantly outperforms existing state-of-the-art methods in cross-domain generalization.
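The local rotation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the rotation angles or how the target region is chosen, so the sketch assumes a square region given by hypothetical `top`, `left`, and `size` parameters and restricts rotations to multiples of 90 degrees. The function names `rotate90` and `rotate_target_region` are illustrative only. The point it shows is that the image patch and its mask patch must receive the same rotation so that the label stays aligned with the target.

```python
import random


def rotate90(patch, k):
    """Rotate a square 2D list counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transposing and then reversing the row order is a 90-degree
        # counter-clockwise rotation.
        patch = [list(row) for row in zip(*patch)][::-1]
    return [list(row) for row in patch]


def rotate_target_region(image, mask, top, left, size, rng=random):
    """Rotate the same square region in an image and its mask.

    Assumption (not from the paper): the target region is a square box
    and the rotation is a random multiple of 90 degrees.
    Returns new copies of the image and mask plus the rotation count k.
    """
    k = rng.randrange(4)  # 0, 1, 2, or 3 quarter turns
    out_image = [list(row) for row in image]
    out_mask = [list(row) for row in mask]
    for src, dst in ((image, out_image), (mask, out_mask)):
        patch = [row[left:left + size] for row in src[top:top + size]]
        rotated = rotate90(patch, k)
        for i in range(size):
            dst[top + i][left:left + size] = rotated[i]
    return out_image, out_mask, k


# Toy 4x4 example with a 2x2 target region at row 1, column 1.
image = [[0, 0, 0, 0],
         [0, 9, 5, 0],
         [0, 3, 1, 0],
         [0, 0, 0, 0]]
mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
aug_image, aug_mask, k = rotate_target_region(
    image, mask, top=1, left=1, size=2, rng=random.Random(0)
)
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible. Pixels outside the box are left untouched, so the background statistics of the frame are unchanged by this step.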
External IDs: dblp:journals/staeors/LiHWHYT25