Abstract: Generating images of digital fashion models dressed in
a curated outfit has many applications, especially when these fashion
models can be conditioned on different poses, body sizes, etc. In this
paper, we propose novel conditioning architectures for diffusion models
that generate curated outfits rendered on a digital human in a
predefined pose. The conditioned outfits are fed through information
pathways, including a learned deep-set embedding and cross-attention with
the pose skeleton, providing a strong conditioning signal for subject-driven
generation. Such an outfit renderer a) allows fashion imagery to scale
to millions of outfit combinations, b) enables unprecedented
creative control over studio content generation, c) provides a high level of
personalization, since users can explore or complete outfits on demand
and in their own likeness, and d) is a stepping stone toward 2D virtual
try-on, which, unlike 3D virtual try-on, does not require dedicated hardware.
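The deep-set and cross-attention conditioning pathway mentioned above can be sketched as follows. This is a minimal PyTorch illustration under our own assumptions: the module name `OutfitConditioner`, all dimensions, and the choice of pooling the garment set into a single token that pose-skeleton tokens attend to are hypothetical, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class OutfitConditioner(nn.Module):
    """Hypothetical sketch: a permutation-invariant deep-set embedding of
    garment features, fused with pose-skeleton tokens via cross-attention.
    The resulting tokens could serve as a conditioning signal for a
    diffusion model's denoiser."""

    def __init__(self, garment_dim=64, pose_dim=32, hidden=128):
        super().__init__()
        # Deep set: per-garment network phi, sum-pool over the set, then rho.
        self.phi = nn.Sequential(
            nn.Linear(garment_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden),
        )
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        self.pose_proj = nn.Linear(pose_dim, hidden)
        # Pose tokens (queries) attend to the pooled outfit embedding.
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)

    def forward(self, garments, pose):
        # garments: (B, N, garment_dim) -- an unordered set of garment features
        # pose:     (B, J, pose_dim)    -- skeleton joint tokens
        outfit = self.rho(self.phi(garments).sum(dim=1, keepdim=True))  # (B, 1, H)
        queries = self.pose_proj(pose)                                  # (B, J, H)
        cond, _ = self.attn(queries, outfit, outfit)                    # (B, J, H)
        return cond

model = OutfitConditioner().eval()
cond = model(torch.randn(2, 5, 64), torch.randn(2, 17, 32))
print(cond.shape)  # torch.Size([2, 17, 128])
```

Because the garment embeddings are sum-pooled before `rho`, the conditioning output is invariant to the order in which garments are listed, which is the defining property of a deep-set encoder.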