Triple Attention For Robust Video Crowd Counting

Qiyao Wu, Chongyang Zhang, Xiyu Kong, Muming Zhao, Yanjun Chen

Published: 2020, Last Modified: 30 Jan 2025. Venue: ICIP 2020. License: CC BY-SA 4.0

Abstract: Traditional crowd counting methods based on static images work well on public datasets. However, due to the complexity and variability of real-world scenarios, their performance tends to drop dramatically in practice. To improve the robustness of crowd counting, we propose a co-attention mechanism that extracts correlation features between adjacent video frames, which enhances the distinguishability of foreground from background. We also combine co-attention with spatial attention and multi-scale self-attention. These three different and complementary attention-based modules jointly reinforce the robustness of the counting model. Experiments on two widely used video crowd datasets demonstrate the effectiveness of the proposed approach.
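
The abstract does not give the exact form of the co-attention module, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes each frame has already been encoded as a list of per-location feature vectors, and it uses a plain dot-product affinity with no learned weights. The names `softmax`, `co_attention`, `feat_a`, and `feat_b` are hypothetical and chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def co_attention(feat_a, feat_b):
    """Attend from each location in frame A to all locations in frame B.

    feat_a, feat_b: lists of feature vectors (one per spatial location),
    each of length C. Returns one attended vector of length C per
    location in frame A.
    Assumption: dot-product affinity, no learned projection matrices.
    """
    channels = len(feat_b[0])
    attended = []
    for query in feat_a:
        # Affinity between this frame-A location and every frame-B location.
        scores = [sum(q * k for q, k in zip(query, key)) for key in feat_b]
        weights = softmax(scores)
        # Weighted sum of frame-B features under those attention weights.
        attended.append([
            sum(w * key[c] for w, key in zip(weights, feat_b))
            for c in range(channels)
        ])
    return attended


# Toy example: two locations, two channels, identical adjacent frames.
frame_t = [[1.0, 0.0], [0.0, 1.0]]
frame_t1 = [[1.0, 0.0], [0.0, 1.0]]
correlated = co_attention(frame_t, frame_t1)
```

In this toy case each location attends most strongly to the matching location in the adjacent frame, which is the kind of cross-frame correlation the abstract suggests helps separate foreground from background. A real model would operate on convolutional feature maps and would typically include learned projections.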