Abstract: In this paper, we address the problem of person re-identification (re-id), which remains challenging due to viewpoint changes, pose variations, and differing camera settings. Unlike common methods that directly concatenate descriptors extracted from different support regions and feature channels into one long vector, we explicitly encode the importance of different feature channels and support regions, and propose a two-layer distance ensemble model called DeNet to measure the similarity between two images. The first layer of DeNet combines the distances of different support regions, while the second layer weights the different feature channels. The weight parameters of DeNet are learnt under the large-margin framework, with the goal of maximizing the difference between the distances of positive and negative matching pairs. Our method achieves competitive results on the widely used VIPeR and PRID 450S datasets.
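To make the two-layer structure concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each layer is a plain weighted sum, that region weights are kept separately per feature channel, and that the large-margin objective takes the form of a hinge loss on one positive and one negative pair. The abstract specifies none of these details, and the names `denet_distance` and `hinge_margin_loss` are hypothetical.

```python
from typing import List


def denet_distance(
    region_dists: List[List[float]],
    region_weights: List[List[float]],
    channel_weights: List[float],
) -> float:
    """Two-layer weighted distance between two images (illustrative only).

    region_dists[c][r] is the distance between the two images computed on
    feature channel c and support region r. The weighted-sum form of both
    layers is an assumption, not something stated in the abstract.
    """
    # Layer 1: combine the per-region distances within each feature channel.
    channel_dists = [
        sum(w * d for w, d in zip(region_weights[c], region_dists[c]))
        for c in range(len(region_dists))
    ]
    # Layer 2: weight the resulting per-channel distances.
    return sum(v * d for v, d in zip(channel_weights, channel_dists))


def hinge_margin_loss(d_pos: float, d_neg: float, margin: float = 1.0) -> float:
    """Assumed large-margin loss: zero once the negative pair is farther
    than the positive pair by at least `margin`."""
    return max(0.0, margin + d_pos - d_neg)


if __name__ == "__main__":
    # Toy distances for 2 feature channels and 3 support regions (made up).
    pos = [[0.2, 0.1, 0.3], [0.4, 0.2, 0.1]]  # same person, two cameras
    neg = [[0.9, 0.8, 0.7], [0.6, 0.9, 0.8]]  # different people
    region_w = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]
    channel_w = [0.5, 0.5]

    d_pos = denet_distance(pos, region_w, channel_w)
    d_neg = denet_distance(neg, region_w, channel_w)
    print(d_pos, d_neg, hinge_margin_loss(d_pos, d_neg))
```

In this reading, training would adjust `region_w` and `channel_w` to reduce the loss summed over many positive and negative pairs. The optimizer and any constraints on the weights are left open here because the abstract does not describe them.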