Understanding Scale Shift in Domain Generalization for Crowd Localization

23 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Crowd Localization, Domain Generalization, Scale Shift
TL;DR: We present ScaleBench, a benchmark for scale-shift domain generalization, and provide a rigorous theoretical analysis of this issue, which motivates an algorithm called Semantic Hook to support future research on this problem.
Abstract: Crowd localization plays a crucial role in visual scene understanding by predicting each pedestrian's location in a crowd, making it applicable to various downstream tasks. However, existing approaches suffer significant performance degradation from differences in head-scale distributions (scale shift) between training and testing data, a challenge known as domain generalization (DG). This paper aims to understand the nature of scale shift within the context of domain generalization for crowd localization models. To this end, we address three key questions: (i) how to quantify the influence of scale shift on the DG task, (ii) why this influence occurs, and (iii) how to mitigate it. Specifically, we first establish a benchmark, ScaleBench, and reproduce 20 advanced DG algorithms to quantify the influence. Through extensive experiments, we demonstrate the limitations of existing algorithms and highlight the under-explored nature of this issue. To understand its underlying cause, we provide a rigorous theoretical analysis of scale shift. Building on this analysis, we propose a simple yet effective algorithm called Semantic Hook to mitigate the influence of scale shift on DG, which also serves as a case study revealing three significant insights for future research. Our results emphasize the importance of this novel and practical research direction, which we term $\textit{Scale Shift Domain Generalization}$.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2971
