Abstract: Infrared and visible image fusion aims to extract complementary features to synthesize a single fused image. In our method, we convert the regular image format into the graph space and apply graph convolutional networks (GCNs) to extract non-local self-similarity (NLss) features for reliable infrared and visible image fusion. More specifically, GCNs are first performed on each intra-modal set to aggregate features and propagate the inherent information, thereby extracting independent intra-modal NLss. Then, the intra-modal NLss features of the infrared and visible images are concatenated to explore cross-domain NLss inter-modally and to reconstruct the fused images. Extensive experiments with qualitative and quantitative analysis on the TNO, RoadScene, and M3FD datasets show the superior performance of our method, outperforming many state-of-the-art (SOTA) methods in robust and effective infrared and visible image fusion.

Note to Practitioners—This paper was motivated by the problem that most existing methods, which employ convolutional neural networks (CNNs) and transformer-based frameworks, mainly extract local features and long-range dependencies. However, they often overlook the image's NLss or introduce information redundancy, resulting in poor infrared and visible image fusion. To address these problems, graph-based data representations can construct relationships among spatially repeatable details or textures across large spatial distances and are more suitable for handling irregular objects. Therefore, it is significant to convert the regular image format into the graph space and apply GCNs to extract NLss for reliable infrared and visible image fusion. In this paper, we develop an infrared and visible image fusion method based on a graph representation learning strategy.
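To illustrate the idea at a high level, the following is a minimal sketch (not the authors' implementation): image patches are treated as graph nodes, a k-NN graph over patch features links similar patches regardless of spatial distance (capturing non-local self-similarity), one generic GCN layer aggregates features within each modality, and the resulting intra-modal features are concatenated across modalities. The patch size, neighbourhood size k, and all function names are illustrative assumptions.

```python
# Illustrative sketch only; all names, patch/k sizes, and weights are assumptions.
import torch
import torch.nn.functional as F


def patches_to_nodes(img, patch=8):
    """Split a single-channel image (1, H, W) into flattened patch features."""
    p = img.unfold(1, patch, patch).unfold(2, patch, patch)  # (1, H/p, W/p, p, p)
    return p.contiguous().view(-1, patch * patch)            # (N_nodes, p*p)


def knn_adjacency(x, k=8):
    """Build a symmetric k-NN graph from pairwise feature distances."""
    d = torch.cdist(x, x)                                    # (N, N) distances
    idx = d.topk(k + 1, largest=False).indices[:, 1:]        # skip self-distance
    a = torch.zeros(x.size(0), x.size(0))
    a.scatter_(1, idx, 1.0)
    return torch.maximum(a, a.t())                           # symmetrize


def gcn_layer(x, a, weight):
    """One graph convolution: X' = ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = a + torch.eye(a.size(0))
    d_inv_sqrt = a_hat.sum(1).clamp(min=1e-6).pow(-0.5)
    a_norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
    return F.relu(a_norm @ x @ weight)


# Intra-modal NLss features per modality, then inter-modal concatenation.
ir = torch.rand(1, 128, 128)    # toy infrared image
vis = torch.rand(1, 128, 128)   # toy visible image
w = torch.randn(64, 32)         # illustrative GCN weight (untrained)

feats = []
for img in (ir, vis):
    nodes = patches_to_nodes(img)             # (256, 64) for 8x8 patches
    adj = knn_adjacency(nodes)                # non-local graph within one modality
    feats.append(gcn_layer(nodes, adj, w))    # intra-modal NLss features

fused_feats = torch.cat(feats, dim=1)         # cross-modal features for reconstruction
print(fused_feats.shape)                      # torch.Size([256, 64])
```

In this toy setup the fused feature tensor would still need a learned decoder to reconstruct a fused image; the sketch only shows how graph construction and graph convolution can gather repeatable details from spatially distant patches within each modality before the cross-modal concatenation step.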