Abstract: Interactive retrieval for online fashion shopping lets users refine image retrieval results through feedback. A common problem in interactive retrieval is that a specific user interaction (e.g., changing the color of a T-shirt) inadvertently changes other aspects (e.g., the retrieved item has a sleeve type different from the query's). This happens because existing methods learn visual representations that are semantically entangled in the embedding space, which limits the controllability of the retrieved results. We propose to leverage the semantics of visual attributes to train convolutional networks that learn an attribute-specific subspace for each attribute, yielding disentangled representations. Operations such as swapping out one attribute value for another thus affect only the attribute at hand and leave the others untouched. We show that our model can be tailored to different retrieval tasks while maintaining its disentanglement property. We obtain state-of-the-art performance on three interactive fashion retrieval tasks: attribute manipulation retrieval, conditional similarity retrieval, and outfit complementary item retrieval.
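A minimal sketch of the core idea, under assumptions not stated in the abstract: the embedding is partitioned into one sub-embedding per attribute, so attribute manipulation amounts to replacing a single subspace while the rest stay fixed. The attribute names, dimensions, projection heads, and the "red" prototype below are all illustrative stand-ins, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
ATTRS = ["color", "sleeve", "neckline"]  # illustrative attribute set
D_IN, D_SUB = 16, 4                      # backbone dim, per-attribute subspace dim

# One projection head per attribute (random matrices stand in for learned heads).
heads = {a: rng.standard_normal((D_SUB, D_IN)) for a in ATTRS}

def embed(x):
    """Map a backbone feature x of shape (D_IN,) to per-attribute sub-embeddings."""
    return {a: heads[a] @ x for a in ATTRS}

def manipulate(emb, attr, target_sub):
    """Swap one attribute's sub-embedding; all other subspaces are untouched."""
    out = dict(emb)
    out[attr] = target_sub
    return out

def full_vector(emb):
    """Concatenate subspaces for whole-item retrieval."""
    return np.concatenate([emb[a] for a in ATTRS])

# Query item; change only its color subspace (e.g., to a "red" prototype).
q = embed(rng.standard_normal(D_IN))
red_prototype = rng.standard_normal(D_SUB)  # hypothetical attribute-value embedding
q_red = manipulate(q, "color", red_prototype)

assert np.allclose(q_red["sleeve"], q["sleeve"])    # other attributes unchanged
assert not np.allclose(q_red["color"], q["color"])  # target attribute changed
```

Because each attribute lives in its own subspace, nearest-neighbor search on `full_vector(q_red)` can retrieve items matching the new color while preserving sleeve type and neckline, which is the controllability the entangled baseline lacks.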