Abstract: We propose MobileCount, a computationally efficient encoder-decoder architecture designed for high-accuracy, real-time crowd counting on mobile and embedded devices with limited computational resources. The encoder is a tailored MobileNetV2 with 4 bottleneck blocks preceded by a max pooling layer of stride 2, which significantly reduces FLOPs at a small cost in counting performance. The decoder design is motivated by Light-Weight RefineNet and further boosts counting performance with only a 10% increase in FLOPs. Compared with state-of-the-art methods, the proposed network achieves comparable counting performance with 1/10 of the FLOPs on a number of benchmarks.