Keywords: Computer Vision, Crowd Counting, Semi-Supervised Learning
Abstract: This paper focuses on semi-supervised crowd counting, where only a small portion of the training data are labeled. We formulate the pixel-wise density value to regress as a probability distribution, instead of a single deterministic value, and utilize a dual-branch structure to model the corresponding discrete form of the distribution function. On the basis, we propose a semi-supervised crowd counting model. Firstly, we enhance the transformer decoder by usingdensity tokens to specialize the forwards of decoders w.r.t. different density intervals; Secondly, we design a pixel-wise distribution matching loss to measure the differences in the pixel-wise density distributions between the prediction and the ground-truth; Thirdly, we propose an interleaving consistency regularization term to align the prediction of two branches and make them consistent. Extensive experiments on four datasets are performed to show that our method clearly outperforms the competitors by a large margin under various labeled ratio settings.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)
Supplementary Material: zip
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/semi-supervised-counting-via-pixel-by-pixel/code)
17 Replies
Loading