ASM-GAN: Arbitrary Object Shape Styles Manipulation in Synthesized Images for GANs

Published: 01 Jan 2025, Last Modified: 15 May 2025. IEEE Trans. Instrum. Meas. 2025. License: CC BY-SA 4.0.
Abstract: Deep learning-based algorithms are increasingly used for vision-based measurement (VBM) tasks, especially for detecting defects (i.e., the objects in our study) in the industrial product inspection domain. These algorithms rely heavily on a sufficient number of labeled defective samples covering diverse shape styles. In many real-world industrial applications, however, samples are scarce and the defective shape styles they cover are limited. Existing studies employ generative models to manipulate synthesized objects into new, arbitrary shape styles that are never observed in the training samples, but they have two limitations: 1) models that control the representations to manipulate shape styles can cause representation learning to fail and 2) approaches that discretize object representations to manipulate arbitrary shape styles may yield low generation quality. In this article, we propose a new generation approach, named arbitrary shape manipulation generative adversarial net (ASM-GAN), to address these issues. We use the vector quantization (VQ) method to discretely learn the representations extracted by an encoder–decoder, which forms the generator. This helps the model learn the content representations (e.g., surroundings, context, and color) of the target objects in the original images, so that after the shape styles are manipulated via segmentation, the synthesized object is rendered with those contents. In addition, we innovatively introduce the Hausdorff metric to guide the discrete learning in VQ and improve generation quality, given that the Kullback–Leibler (KL) divergence adopted by traditional VQ can be $\infty$ or 0, which certainly hurts generation quality. We demonstrate, from both theoretical and empirical aspects, that ASM-GAN significantly outperforms SOTA baseline models in synthesizing high-fidelity images with new object shape styles.
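The abstract does not give ASM-GAN's implementation details, but the two ingredients it names, nearest-codebook assignment in VQ and a Hausdorff set distance as a bounded alternative to KL divergence, can be sketched in a few lines. The following is a minimal illustrative sketch only (function names `quantize` and `hausdorff` and the use of plain NumPy arrays are our assumptions, not the paper's code); note that, unlike KL divergence, the symmetric Hausdorff distance between two finite point sets is always finite.

```python
import numpy as np

def quantize(z, codebook):
    # Standard VQ step: map each encoder feature vector in z (N, D)
    # to its nearest codebook entry (K, D) under Euclidean distance.
    d = np.linalg.norm(z[:, None, :] - codebook[None, :, :], axis=-1)  # (N, K)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

def hausdorff(a, b):
    # Symmetric Hausdorff distance between point sets a (N, D) and b (M, D):
    # the largest distance from any point in one set to the nearest point
    # in the other. Always finite for finite sets, unlike KL divergence,
    # which can be infinite or zero on mismatched supports.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M)
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

In a VQ-style training loop, `hausdorff(z, codebook)` could serve as a regularizer pulling the encoder outputs and the codebook toward each other as sets, which is one way to read the abstract's claim that the Hausdorff metric replaces the KL term.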