Editorial: Recent advances in artificial neural networks and embedded systems for multi-source image fusion

Published: 01 Jan 2022 · Last Modified: 13 Nov 2024 · Frontiers in Neurorobotics 2022 · CC BY-SA 4.0
Abstract: Multi-source visual information fusion helps robotic systems perceive the real world, and image fusion is a computational technique that merges multi-source images from multiple sensors into a single synthesized image providing a more comprehensive or reliable description. At present, many brain-inspired algorithms (or models) are being actively proposed to accomplish the image fusion task, and the artificial neural network, especially the deep convolutional neural network, has become one of the most popular techniques for processing multi-source image fusion. This is an exciting field for the image fusion research community, and many interesting issues remain to be explored, such as deep few-shot learning, unsupervised learning, the application of embodied neural systems, and industrial applications.

How to develop a sound biological neural network and an embedded system that fuse the multiple features of source images are the two key questions to be addressed in multi-source image fusion. Hence, studies of image fusion can be divided into two aspects: first, new end-to-end neural network models that merge constituent parts during the image fusion process; second, the embodiment of artificial neural networks in image fusion systems. In addition, currently booming techniques, including deep neural systems and embodied artificial intelligence systems, are considered potential future trends for reinforcing the performance of image fusion.

In the first work, entitled "Multi-Focus Color Image Fusion Based on Quaternion Multi-Scale Singular Value Decomposition (QMSVD)", Wan et al. employed multichannel quaternion multi-scale singular value decomposition to decompose the multi-focus color images, obtaining a set of low-frequency and high-frequency sub-images.
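The multi-scale SVD idea is easiest to see in a simplified, real-valued form: a truncated SVD yields a low-frequency approximation, and the residual carries the high-frequency detail. The sketch below illustrates only that splitting step; it is an assumption-laden stand-in, not the authors' quaternion-valued QMSVD.

```python
import numpy as np

def svd_split(img, k=2):
    """Split a 2-D image into a low-frequency approximation (top-k
    singular components) and a high-frequency residual."""
    U, s, Vt = np.linalg.svd(img, full_matrices=False)
    low = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return low, img - low

rng = np.random.default_rng(0)
img = rng.random((8, 8))
low, high = svd_split(img)
# By construction the two bands reconstruct the source: low + high == img.
```

Fusion rules such as the low-frequency decision mapping described next then operate on these sub-images before the inverse transform.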
The activity level and matching level are exploited in the focus decision mapping of the low-frequency sub-image fusion, and a local contrast fusion rule based on the integration of high-frequency and low-frequency regions is proposed. Finally, the fused images are reconstructed by inverse QMSVD. Experiments reveal that the color image fusion method achieves competitive visual effects.

The visual quality of images is seriously affected by bad weather conditions, especially on foggy days. To remove fog from images, Liu et al. introduced a method entitled "Single Image Defogging Method Based on Image Patch Decomposition and Multi-Exposure Image Fusion". The authors proposed a single-image defogging method based on image patch decomposition and multi-exposure fusion that does not require any a priori knowledge of scene depth. First, a single foggy image was processed to produce a set of underexposed images; the underexposed and original images were then enhanced and fused using a guided filter and patch operations.

To protect Tujia brocades, an item of intangible cultural heritage, Shuqi He introduced an unsupervised clustering algorithm for Tujia brocade segmentation, together with automatic selection of the cluster number K based on information fusion. In this method, K was calculated by fusing local binary pattern and gray-level co-occurrence matrix characteristic values. Clustering and segmentation were then performed on the Tujia brocade image with a Gaussian mixture model to obtain a rough preliminary segmentation, and voting optimization and conditional random field filtering were used to refine it into the final result.

In the fourth paper, Wu et al.
proposed fractional wavelet-based generative scattering networks (FrScatNets), in which a fractional wavelet scattering network serves as the encoder to extract image features and a deconvolutional neural network serves as the decoder to generate an image. The authors also developed a feature-map fusion method to reduce the dimensionality of FrScatNet embeddings, and discussed the application of image fusion in this study.

Conventional tensor decomposition is an approximate decomposition model, so image details may be lost when the fused image is reconstructed. To address this, the work on the "matrix product state of tensor" first separated the source images into third-order tensors, so that each tensor could be decomposed into a matrix product form by singular value decomposition; the Sigmoid function was then employed to fuse the key components, and the fused image was reconstructed by multiplying all the fused tensor components.

Lin et al. introduced an integrated circuit board object detection and image augmentation fusion model based on YOLO. The authors first analyzed several popular region-based convolutional neural network and YOLO models, and then proposed a real-time image recognition model for integrated circuit boards (ICBs) in the manufacturing process. They constructed an ICB training dataset and established a preliminary image recognition model to classify and predict ICBs; finally, image augmentation fusion and optimization methods were used to improve the accuracy of the method.

Yu et al. reported a bottom-up visual saliency model in the wavelet domain. In this method, a wavelet transform is first performed on the image to obtain four channels, and a discrete cosine transform is then used to obtain the magnitude spectra and corresponding signum spectra. Third, wavelet-decomposed multiscale magnitude spectra are produced for every channel.
Fourth, six multiscale conspicuity maps are generated for every channel, and the multiscale conspicuity maps of the four channels are fused. Finally, a saliency map is obtained after scale-wise combination. The experimental results show that the proposed model is effective.

Shi et al. proposed an ensemble model for imbalanced node classification on graphs, which uses GNNs as the base classifiers during boosting. In this method, higher weights are set for the training samples that were not correctly classified by the previous classifiers. Transfer learning is also employed to reduce computational cost and increase fitting ability. Experiments show that the proposed method achieves better performance than a graph convolutional network.

Deep neural networks have proven vulnerable to adversarial-example attacks. Xie et al. proposed a new noise data enhancement method that transforms only the adversarial perturbation to improve the transferability of adversarial examples, combining noise data enhancement with random erasing. Experiments demonstrate the effectiveness of this method.

Because it is difficult for GAN-based methods to converge completely to the distribution of the face space during training, Yang et al. proposed a face-swapping method based on a pretrained StyleGAN generator and designed a control strategy for the generator based on the idea of encoding and decoding. Experiments show that the proposed method outperforms other state-of-the-art methods.

In the paper entitled "Adaptive Fusion Based Method for Imbalanced Data Classification", Liang et al. proposed an ensemble method that combines data transformation and an adaptive weighted voting scheme for imbalanced data classification. They first utilized modified metric learning to obtain a feature space suited to the imbalanced data, and then assigned different weights to the base classifiers adaptively.
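As an illustration of the general idea behind adaptive weighted voting (a generic sketch, not Liang et al.'s exact scheme), base classifiers can be weighted by a per-classifier score such as validation accuracy, so that a strong classifier can outvote several weak ones; the weighting rule here is an assumption for the sketch.

```python
from collections import defaultdict

def adaptive_vote(predictions, weights):
    """Weighted majority vote: each base classifier's predicted label
    is credited with that classifier's weight; the label with the
    highest total weight wins."""
    scores = defaultdict(float)
    for label, w in zip(predictions, weights):
        scores[label] += w
    return max(scores, key=scores.get)

# Two weak classifiers (weight 0.4) agree on class 0, but the single
# stronger classifier (weight 0.9) outvotes their combined 0.8.
winner = adaptive_vote([0, 0, 1], [0.4, 0.4, 0.9])
# → 1
```

With equal weights this reduces to plain majority voting; the adaptive element is that the weights are learned from the (imbalanced) data rather than fixed.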
Experiments on multiple imbalanced datasets were performed to verify the performance of this algorithm.

In the work entitled "Multi-Exposure Image Fusion Algorithm Based on Improved Weight Function", Xu et al. proposed a multi-exposure image fusion method based on the Laplacian pyramid. On top of the Laplacian pyramid decomposition, an improved weight function was used to capture source image details. Six multi-exposure image fusion methods were compared with the proposed method on 20 sets of multi-exposure image sequences.

Sketch face recognition matches cross-modality facial images from sketch to photo, which is important in criminal investigation. Guo et al. introduced an effective cross-task modality alignment network for sketch face recognition, with a meta-learning training episode strategy to address the small-sample problem. In this work, they proposed a two-stream network to capture modality-specific and sharable features, and two cross-task memory mechanisms were also proposed to improve feature learning. Finally, a cross-task modality alignment loss was proposed to train the model.
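The Laplacian pyramid decomposition underlying multi-exposure fusion methods such as Xu et al.'s can be sketched in a minimal form; the box blur below stands in for the usual Gaussian filter, a simplifying assumption for illustration only.

```python
import numpy as np

def box_blur(img):
    """3x3 box blur with edge padding (stand-in for a Gaussian filter)."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def laplacian_pyramid(img, levels=3):
    """Each level stores the detail (image minus its blurred version);
    the final entry is the coarse low-resolution residual."""
    pyramid, cur = [], img
    for _ in range(levels):
        low = box_blur(cur)
        pyramid.append(cur - low)   # high-frequency detail at this scale
        cur = low[::2, ::2]         # downsample for the next level
    pyramid.append(cur)             # coarse residual
    return pyramid
```

In a multi-exposure setting, each source image is decomposed this way, the detail levels are combined with per-pixel weights (the "weight function" the paper improves), and the fused pyramid is collapsed back into a single image.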
