F-AttNet: Towards Multi-scale Feature Fusion for Fashion Attribute Prediction

2021 (modified: 17 Feb 2023) | IJCNN 2021 | Readers: Everyone
Abstract: Large-scale attribute recognition in fashion retail images is a crucial task in image-based recommendation systems. The task is challenging because of visually similar instances, minute localized information, and overlapping features. Class imbalance further exacerbates the difficulty and calls for a dedicated solution. In this work, the F-AttNet architecture is proposed, built by hierarchically aligning novel Attentive Multi-scale Feature (AMF) encoder blocks. The AMF encoders extract mid-level, multi-scale, fine-grained attribute features from multiple representations of low-level features; the high-level global description is then encoded by adaptively calibrating the channel weights. To improve training performance, a novel gamma-variant focal loss is developed. It handles class imbalance by imposing a larger penalty and assigning relative weights to positive and negative instances, which shifts the focus of the network to false instances. Experimental results and ablation studies of F-AttNet on iMaterialist-2018, a large-scale fashion attribute recognition database, demonstrate a significant performance improvement over state-of-the-art methods.
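The abstract does not give the form of the gamma-variant focal loss, so the sketch below shows only the standard binary focal loss (Lin et al.) that such variants typically build on, applied per attribute in a multi-label setting. It is not the loss proposed in the paper. The function name `binary_focal_loss` and the default values of `gamma` and `alpha` are assumptions chosen for illustration.

```python
import numpy as np


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Standard binary focal loss, averaged over all entries.

    NOTE: this is the baseline focal loss, NOT the paper's gamma-variant.
    probs   : predicted probabilities in [0, 1], one per attribute
    targets : ground-truth labels, 1 (positive) or 0 (negative)
    gamma   : focusing parameter; larger values down-weight easy examples
    alpha   : relative weight of positives (negatives get 1 - alpha)
    """
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=np.float64)

    # Probability the model assigns to the true class of each entry.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # Class-dependent weight for positive vs. negative entries.
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)

    # (1 - p_t)^gamma shrinks the loss of confidently correct predictions,
    # so misclassified entries dominate the average.
    loss = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
    return loss.mean()


if __name__ == "__main__":
    p = np.array([0.9, 0.1, 0.6])  # hypothetical predictions
    t = np.array([1, 0, 1])        # hypothetical labels
    print(binary_focal_loss(p, t))
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary binary cross-entropy, which is a convenient sanity check when experimenting with other weightings.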