Abstract: Video-based Face Recognition (VFR) can be cast as the problem of matching two image sets, each containing the face images captured from one video. For this purpose, we propose to bridge the two sets with a reference image set that is well defined and pre-structured offline into a number of local models. In other words, once each of two given image sets is aligned to the reference set, the two sets are mutually aligned and well structured. Therefore, the similarity between them can be computed by comparing only the corresponding local models rather than considering all pairs. To align an image set with the reference set, we formulate the problem as a quadratic program. It integrates three constraints to guarantee robust alignment: an appearance matching cost term that exploits principal angles, a geometric structure consistency term that uses affine-invariant reconstruction weights, and a smoothness constraint that preserves local neighborhood relationships. Extensive experimental evaluations are performed on three databases: Honda, MoBo, and YouTube. Compared with competing methods, our approach consistently achieves better results.
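To illustrate the idea of comparing only corresponding local models, the following is a minimal sketch, not the method of the paper. It assumes that both image sets have already been aligned to the same reference set, so that position k in each list holds the feature vectors assigned to reference local model k. The function names are hypothetical, and the per-model score (cosine similarity between mean feature vectors) is a simple stand-in chosen for illustration; it is not the principal-angle matching cost mentioned above.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def aligned_set_similarity(models_a, models_b):
    """Average per-model similarity between two aligned image sets.

    models_a[k] and models_b[k] are the feature vectors that each set
    assigned to reference local model k (assumed alignment output).
    Only matching indices are compared, so the cost grows with the
    number of local models, not with the number of image pairs.
    """
    if len(models_a) != len(models_b):
        raise ValueError("both sets must use the same reference models")
    scores = []
    for images_a, images_b in zip(models_a, models_b):
        if not images_a or not images_b:
            continue  # skip local models that one of the sets left empty
        scores.append(cosine(mean_vector(images_a), mean_vector(images_b)))
    if not scores:
        raise ValueError("no local model is populated in both sets")
    return sum(scores) / len(scores)


# Toy data: two local models, 2-D features (purely illustrative values).
set_a = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0]]]
set_b = [[[2.0, 0.0]], [[0.0, 1.0], [0.0, 3.0]]]
similarity = aligned_set_similarity(set_a, set_b)
```

In this toy case the mean vectors of each pair of corresponding local models point in the same direction, so the score is at its maximum. Swapping in a subspace-based score per local model would change only the body of the loop.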