Abstract: Interactive retrieval for online fashion shopping lets users refine image retrieval results through feedback. A common problem in interactive retrieval is that a specific user interaction (e.g., changing the color of a T-shirt) inadvertently changes other aspects (e.g., the retrieved item has a sleeve type different from the query's). This happens because existing methods learn visual representations that are semantically entangled in the embedding space, which limits the controllability of the retrieved results. We propose to leverage the semantics of visual attributes to train convolutional networks that learn an attribute-specific subspace for each attribute, yielding disentangled representations. Operations such as swapping out one attribute value for another thus affect only the attribute at hand and leave the others untouched. We show that our model can be tailored to different retrieval tasks while maintaining its disentanglement property. We obtain state-of-the-art performance on three interactive fashion retrieval tasks: attribute manipulation retrieval, conditional similarity retrieval, and outfit complementary item retrieval.
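A minimal sketch of the core idea, under assumptions not stated in the abstract: the embedding is partitioned into one sub-embedding per attribute, so attribute manipulation amounts to replacing a single subspace while the rest stay fixed. The attribute names, dimensions, projection heads, and the "red" prototype below are all illustrative stand-ins, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
ATTRS = ["color", "sleeve", "neckline"]  # illustrative attribute set
D_IN, D_SUB = 16, 4                      # backbone dim, per-attribute subspace dim

# One projection head per attribute (random matrices stand in for learned heads).
heads = {a: rng.standard_normal((D_SUB, D_IN)) for a in ATTRS}

def embed(x):
    """Map a backbone feature x of shape (D_IN,) to per-attribute sub-embeddings."""
    return {a: heads[a] @ x for a in ATTRS}

def manipulate(emb, attr, target_sub):
    """Swap one attribute's sub-embedding; all other subspaces are untouched."""
    out = dict(emb)
    out[attr] = target_sub
    return out

def full_vector(emb):
    """Concatenate subspaces for whole-item retrieval."""
    return np.concatenate([emb[a] for a in ATTRS])

# Query item; change only its color subspace (e.g., to a "red" prototype).
q = embed(rng.standard_normal(D_IN))
red_prototype = rng.standard_normal(D_SUB)  # hypothetical attribute-value embedding
q_red = manipulate(q, "color", red_prototype)

assert np.allclose(q_red["sleeve"], q["sleeve"])    # other attributes unchanged
assert not np.allclose(q_red["color"], q["color"])  # target attribute changed
```

Because each attribute lives in its own subspace, nearest-neighbor search on `full_vector(q_red)` can retrieve items matching the new color while preserving sleeve type and neckline, which is the controllability the entangled baseline lacks.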