Abstract: Precise disentanglement of single-domain features, grounded in the internal correlation between the source and target domains, is the key to high-fidelity image-to-image translation. To address the difficulty of disentangling cross-domain features and their weak correlation, this paper proposes a feature regroup and redistribution (RR) module that performs hierarchical feature processing and cross-domain feature interaction in a mutual space for controllable image-to-image translation. In the feature regroup unit, frequency pyramids with different frequency intervals are designed to extract content features such as multi-level spatial structure and global color-semantic information. The pyramid outputs are then mapped into a mutual pool, where cross-domain feature differences are compared and similarities are learned to achieve accurate analysis. In the redistribution unit, the mutual-pool output and single-domain features are fused via spatial attention to correct content- and style-feature transmission errors. We further design a mutual-learning generative adversarial network built on the RR module, which achieves low-error image-to-image translation in real scenes. Experiments on the BDD100K and Sim10k datasets show substantial improvements in FID, IS, KID_mean, and KID_stddev.
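To make the regroup/redistribution pipeline concrete, the following is a minimal sketch of the idea on 1-D signals: a band-pass "pyramid" splits a signal into frequency bands (regroup), a mutual pool scores cross-domain band similarity, and the bands are recombined with similarity-derived gates (redistribution). All function names (`regroup`, `mutual_pool`, `redistribute`) and the box-filter decomposition are illustrative assumptions, not the authors' actual network.

```python
# Toy sketch of a regroup-and-redistribute (RR) pipeline on 1-D signals.
# The box-filter pyramid and gating are stand-ins for the paper's frequency
# pyramid and spatial-attention fusion; names here are hypothetical.

def box_filter(x, k=3):
    """Moving average as a crude low-pass filter (shrinking window at edges)."""
    r = k // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - r): i + r + 1]
        out.append(sum(window) / len(window))
    return out

def regroup(x, levels=2):
    """Split a signal into frequency bands: at each level, the residual of a
    low-pass step is the high-frequency band; the final low-pass output is
    the global (coarse) component. The bands sum back to the input."""
    bands, low = [], x
    for _ in range(levels):
        smooth = box_filter(low)
        bands.append([a - b for a, b in zip(low, smooth)])  # high-frequency residual
        low = smooth
    bands.append(low)  # global low-frequency component
    return bands

def mutual_pool(band_src, band_tgt):
    """Cross-domain similarity of corresponding bands (cosine similarity)."""
    dot = sum(a * b for a, b in zip(band_src, band_tgt))
    ns = sum(a * a for a in band_src) ** 0.5
    nt = sum(b * b for b in band_tgt) ** 0.5
    return dot / (ns * nt + 1e-8)

def redistribute(bands, weights):
    """Recombine bands under per-band gates (stand-in for attention fusion)."""
    out = [0.0] * len(bands[0])
    for band, w in zip(bands, weights):
        out = [o + w * v for o, v in zip(out, band)]
    return out
```

With all gate weights set to 1 the decomposition is exactly invertible (the bands sum back to the input), which mirrors why band-wise gating can correct individual content or style components without disturbing the others.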