Interactive Clothes Image Retrieval via Multi-modal Feature Fusion of Image Representation and Natural Language Feedback
Abstract: Clothes image retrieval is a fundamental task that has attracted research interest over the past decades. Since a single retrieval pass usually cannot achieve the best performance, we develop an interactive image retrieval system for fashion outfit search that uses natural language feedback from the user to capture compound and more specific details of clothing attributes. Our model consists of two parts: a feature fusion module and a similarity metric learning module. The fusion module combines the feature vectors of the modification text with the feature vectors of the reference image. The model is then optimized end-to-end with a matching objective, where we adopt a contrastive learning strategy to learn the similarity metric. Extensive experiments show that, compared with more complex multi-modal models proposed in recent years, our work improves retrieval performance while keeping the architecture simple.
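The abstract describes a fusion module over image and text features trained end-to-end with a contrastive matching objective. The sketch below is a minimal illustration of that idea, not the paper's actual architecture: the fusion layer, dimensions, and the InfoNCE-style in-batch loss are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionModule(nn.Module):
    """Fuse a reference-image embedding with a modification-text embedding.
    A simple concatenation + MLP stand-in for the paper's fusion module."""
    def __init__(self, img_dim=512, txt_dim=512, out_dim=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(img_dim + txt_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, img_feat, txt_feat):
        return self.fuse(torch.cat([img_feat, txt_feat], dim=-1))


def contrastive_matching_loss(query, target, temperature=0.07):
    """In-batch contrastive loss: the i-th fused query should match the
    i-th target-image embedding and be pushed away from all other targets."""
    query = F.normalize(query, dim=-1)
    target = F.normalize(target, dim=-1)
    logits = query @ target.t() / temperature          # (B, B) similarity matrix
    labels = torch.arange(query.size(0), device=query.device)
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    B, D = 8, 512
    fusion = FusionModule(D, D, D)
    img_feat = torch.randn(B, D)   # reference-image features (e.g., from an image encoder)
    txt_feat = torch.randn(B, D)   # feedback-text features (e.g., from a text encoder)
    tgt_feat = torch.randn(B, D)   # target-image features
    loss = contrastive_matching_loss(fusion(img_feat, txt_feat), tgt_feat)
    loss.backward()
    print(f"contrastive loss: {loss.item():.4f}")
```

In this setup the fused (image + feedback) representation is trained to lie close to the target image's representation in the shared embedding space, which is the matching objective the abstract refers to.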