Improving face recognition in surveillance video with judicious selection and fusion of representative frames

Zhaozhen Ding, Qingfang Zheng, Chunhua Hou, Guang Shen

Published: 01 Jan 2020, Last Modified: 25 Jan 2025, MMAsia 2020, CC BY-SA 4.0

Abstract: Face recognition in unconstrained surveillance video is challenging because acquisition settings and facial appearance vary widely. We propose to exploit the complementary information across multiple frames to improve face recognition performance. We design an algorithm that builds a representative frame set from a video sequence by selecting faces with high quality and large appearance diversity. We also devise a refined Deep Residual Equivariant Mapping (DREAM) block to improve the discriminative power of the extracted deep features. Extensive experiments on two face recognition benchmarks, YouTube Faces and IJB-A, show the effectiveness of the proposed method. Our method is also lightweight and can be easily embedded into existing CNN-based face recognition systems.
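
The abstract does not spell out the selection or fusion rule, so the following is only an illustrative sketch of the general idea of choosing high-quality, diverse frames and fusing their features. Everything in it is an assumption rather than the authors' method: the per-frame quality scores, the `min_quality` gate, the greedy farthest-point heuristic on cosine distance, and the quality-weighted average are all hypothetical stand-ins, as are the function names.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def select_representative_frames(features, qualities, k, min_quality=0.5):
    """Hypothetical greedy selection: quality gate, then farthest-point picks.

    Returns the indices of at most k selected frames.
    """
    candidates = [i for i, q in enumerate(qualities) if q >= min_quality]
    if not candidates or k <= 0:
        return []
    # Seed the set with the highest-quality face.
    selected = [max(candidates, key=lambda i: qualities[i])]
    remaining = [i for i in candidates if i != selected[0]]
    while remaining and len(selected) < k:
        # Add the frame whose nearest already-selected frame is farthest away.
        best = max(
            remaining,
            key=lambda i: min(
                cosine_distance(features[i], features[j]) for j in selected
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def fuse_features(features, qualities, indices):
    """Assumed fusion rule: quality-weighted average, then L2 normalisation."""
    if not indices:
        return []
    dim = len(features[indices[0]])
    fused = [0.0] * dim
    for i in indices:
        for d in range(dim):
            fused[d] += qualities[i] * features[i][d]
    norm = math.sqrt(sum(x * x for x in fused))
    return [x / norm for x in fused] if norm > 0.0 else fused


# Toy data: four 2-D "face features" with made-up quality scores.
features = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.7, 0.7]]
qualities = [0.9, 0.8, 0.7, 0.2]

picked = select_representative_frames(features, qualities, k=2)
fused = fuse_features(features, qualities, picked)
```

In this toy run the low-quality fourth frame is filtered out, the best frame seeds the set, and the near-duplicate second frame is skipped in favour of the more dissimilar third one. In a real system the features would come from a CNN face embedding and the quality scores from a face quality estimator, neither of which is modelled here.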