Learning Representative Deep Features for Image Set Analysis

Zifeng Wu, Yongzhen Huang, Liang Wang

Published: 2015, Last Modified: 02 Apr 2026. IEEE Trans. Multim. 2015. CC BY-SA 4.0
Abstract: This paper proposes to learn features from sets of labeled raw images. With this method, over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small amount of training data, i.e., 420 labeled albums with about 30 000 photos. The method can handle sets of images whether or not the sets have temporal structure. A typical approach to sequential image analysis leverages motion between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. Nevertheless, our method outperforms the previous best performers in album classification, and achieves comparable or even better performance in gait-based human identification. These results demonstrate its effectiveness and good adaptability to different kinds of set data.
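To make the idea of order-independent set statistics concrete, here is a minimal sketch in Python. It is not the paper's network or training procedure. It only illustrates, under assumed toy data, how per-feature frequencies and pairwise co-occurrences can be computed over a set of images so that the result does not depend on image order. The function name `set_statistics` and the binary feature vectors in `album` are hypothetical.

```python
def set_statistics(image_features):
    """Summarize a set of per-image binary feature vectors (illustrative only).

    Returns (frequencies, cooccurrences):
      frequencies[i]      = fraction of images in which feature i is active
      cooccurrences[i][j] = fraction of images in which features i and j are both active
    Both are sums over the set, so they ignore the order of the images.
    """
    n_images = len(image_features)
    n_features = len(image_features[0])
    frequencies = [
        sum(f[i] for f in image_features) / n_images
        for i in range(n_features)
    ]
    cooccurrences = [
        [sum(f[i] * f[j] for f in image_features) / n_images
         for j in range(n_features)]
        for i in range(n_features)
    ]
    return frequencies, cooccurrences


# Hypothetical album: 4 images, 3 binary feature activations per image.
album = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
freq, cooc = set_statistics(album)

# Reordering the images leaves the summary unchanged.
freq_rev, cooc_rev = set_statistics(list(reversed(album)))
```

Because both statistics are plain averages over the set, the same summary applies to unordered albums and to ordered sequences such as gait frames. Motion-based approaches, by contrast, depend on the order of adjacent frames.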