DSMGN: Dual-Supervised Mask Generation Network for Infrared and Visible Image Fusion

Published: 01 Jan 2023 · Last Modified: 28 Jan 2025 · IEEE MultiMedia 2023 · CC BY-SA 4.0
Abstract: Because no reference images exist for training infrared and visible image fusion (IVIF) networks, deep learning models struggle to fuse the modal features of different source images, producing fusion results biased toward one modality. This study proposes an IVIF method based on a dual-supervised mask generation network (DSMGN) that consists of three parts: an encoder–decoder-based backbone network and two image-generation branches. In the backbone network, multiple residual dense involution blocks are constructed to extract the salient features of infrared images so that an accurate mask image can be generated. The generated mask image is then used to define the fusion strategy and obtain the fusion result. To address the absence of reference images for network training, the two image-generation branches are designed based on the gray-level co-occurrence matrix and Gaussian blur, generating two images that each emphasize different features of the source images. These two generated images define a joint loss function that supervises the network training. Extensive experiments show that, compared with several state-of-the-art methods, DSMGN achieves better fusion results both subjectively and objectively.
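The abstract states that the generated mask defines the fusion strategy but does not give the exact rule. A common mask-weighted blend consistent with this description is F = M ⊙ IR + (1 − M) ⊙ VIS, where M highlights salient infrared regions; the sketch below illustrates that rule on toy arrays. The function name and the blending rule itself are assumptions for illustration, not the paper's confirmed implementation.

```python
import numpy as np

def mask_fusion(ir, vis, mask):
    """Hypothetical mask-weighted fusion rule (not confirmed by the paper):
    F = M * IR + (1 - M) * VIS, with salience mask M in [0, 1].
    Where M is near 1, infrared content dominates; near 0, visible content does.
    """
    return mask * ir + (1.0 - mask) * vis

# Toy 2x2 example: mask = 1 keeps IR, mask = 0 keeps VIS, 0.5 averages them.
ir = np.array([[1.0, 0.0], [0.0, 1.0]])
vis = np.array([[0.0, 1.0], [1.0, 0.0]])
mask = np.array([[1.0, 0.0], [0.5, 0.5]])
fused = mask_fusion(ir, vis, mask)
# fused == [[1.0, 1.0], [0.5, 0.5]]
```

In training, the paper replaces a ground-truth fusion target with two generated supervision images (one from the gray-level co-occurrence matrix, one from Gaussian blur), which jointly constrain the mask through the loss rather than supervising F directly.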