Abstract: We present a novel hypothesis on the norms of representations produced by convolutional neural networks (CNNs). In particular, we propose the norm-count hypothesis (NCH), which states that there is a monotonically increasing relationship between the number of certain objects in an image and the norm of the corresponding representation. We formalize and prove our hypothesis in a controlled setting, showing that the NCH is true for linear and batch normalized CNNs followed by global average pooling, when they are applied to a certain class of images. Further, we present experimental evidence that corroborates our hypothesis for CNN-based representations. Our experiments are conducted with several real-world image datasets, in supervised, self-supervised, and few-shot learning settings, providing new insight into the relationship between object counts and representation norms.
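Illustrative sketch (not from the paper): the abstract's controlled setting can be imitated with a toy NumPy example. Everything here is an assumption made for illustration, not the authors' formalization: a single bias-free linear convolution layer with wrap-around boundaries, random filters, a zero background, and identical, non-overlapping copies of one random patch as the "objects". The helper names `circular_conv2d`, `representation`, and `make_image` are hypothetical. Under these assumptions the globally average-pooled representation is linear in the image, so its norm grows in proportion to the object count.

```python
import numpy as np

def circular_conv2d(image, kernel):
    """Cross-correlate a 2D image with a 2D kernel using wrap-around boundaries."""
    out = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    for i in range(kh):
        for j in range(kw):
            # Shift the image so that pixel (y + i, x + j) lands at (y, x).
            out += kernel[i, j] * np.roll(image, shift=(-i, -j), axis=(0, 1))
    return out

def representation(image, kernels):
    """Toy linear CNN: one conv layer (no bias, no nonlinearity) + global average pooling."""
    return np.array([circular_conv2d(image, k).mean() for k in kernels])

def make_image(n_objects, patch, size=32, cell=8):
    """Place n identical, non-overlapping copies of `patch` on a zero background."""
    image = np.zeros((size, size))
    ph, pw = patch.shape
    cells_per_row = size // cell
    for idx in range(n_objects):
        r, c = divmod(idx, cells_per_row)
        image[r * cell:r * cell + ph, c * cell:c * cell + pw] = patch
    return image

rng = np.random.default_rng(0)
patch = rng.uniform(0.5, 1.0, size=(5, 5))   # assumed "object": a random positive patch
kernels = rng.normal(size=(4, 3, 3))         # assumed filters: 4 random 3x3 kernels

# Representation norm for images containing 1 to 5 objects.
norms = [np.linalg.norm(representation(make_image(n, patch), kernels))
         for n in range(1, 6)]
for n, value in enumerate(norms, start=1):
    print(f"objects={n}  norm={value:.6f}")
```

In this toy setting the printed norms increase with the object count and the norm for n objects equals n times the norm for one object. This is a consequence of linearity only; it says nothing about nonlinear networks or real images, which the paper addresses separately.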
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: The revision addresses comments by the reviewers. In addition to several minor clarifications and improvements, we have:
1. Extended the theoretical analysis to include batch normalized CNNs.
2. Added experimental results on MS-COCO.
3. Added experimental results with Dense Contrastive Learning [CVPR-2021].
Assigned Action Editor: ~Neil_Houlsby1
Submission Number: 1442