Abstract: Pedestrian counting, especially in video surveillance, has long been a challenging problem because of view variations, scale changes, and spatial occlusions. While most previous approaches try to count people within a single frame, our approach addresses the problem with a group context model, which segments individuals into groups and models the spatiotemporal relationships between them. Starting from basic definitions of the group state, group event, and group relative, we build a group correspondence matrix to model the bidirectional correspondences between the groups in two consecutive frames. A group context is then modeled with a sequence of context masks, which encode not only the spatiotemporal changes within a group but also the historical relevance and spatial dependency between different groups. Finally, we assemble context masks from multiple frames and formulate pedestrian counting as a joint maximum a posteriori problem. Markov-chain Monte Carlo is used to search for an optimal configuration set that matches the group context model. Comprehensive experiments on the PETS2009 data set and the UCSD pedestrian data set show the promising performance of the proposed approach.