Abstract: Deep learning models have achieved remarkable performance in the task of image generation. A large body of literature addresses face generation and editing, with both humans and automatic systems struggling to distinguish real images from generated ones. While most systems achieve excellent visual quality, they still have difficulty preserving the identity of the input subject. Among the explored techniques, Semantic Image Synthesis (SIS) methods, whose goal is to generate an image conditioned on a semantic segmentation mask, are the most promising, even though preserving the perceived identity of the input subject is not their main concern. Therefore, in this paper, we investigate the problem of identity preservation in face image generation and present an SIS architecture that exploits a cross-attention mechanism to merge identity, style, and semantic features in order to generate faces whose identities are as similar as possible to the input ones. Experimental results show that the proposed method is not only suitable for preserving identity but is also effective as a face recognition adversarial attack, i.e., hiding a second identity in the generated faces.