TL;DR: A novel score function for CNNs, based on angular similarity, that is shown to correlate strongly with human visual hardness and yields gains in applications such as self-training.
Abstract: The mechanisms behind human visual systems and convolutional neural networks (CNNs) are vastly different. Hence, it is expected that they have different notions of ambiguity or hardness. In this paper, we make a surprising discovery: there exists a (nearly) universal score function for CNNs whose correlation with human visual hardness is statistically significant. We term this function angular visual hardness (AVH); in a CNN, it is given by the normalized angular distance between a feature embedding and the classifier weights of the corresponding target category. We conduct an in-depth scientific study. We observe that CNN models with the highest accuracy also have the best AVH scores. This agrees with an earlier finding that state-of-the-art models tend to improve on the classification of harder training examples. We find that AVH displays interesting dynamics during training: it quickly reaches a plateau even though the training loss keeps improving. This suggests the need for designing better loss functions that can target harder examples more effectively. Finally, we empirically show a significant performance improvement when AVH is used as a measure of hardness in self-training tasks.
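The abstract defines AVH as the angular distance between a feature embedding and the target class's classifier weights, normalized over all classes. The following is a minimal sketch of how such a score could be computed in PyTorch; the function name, tensor shapes, and normalization over the sum of all class angles are assumptions for illustration, not the paper's official implementation.

```python
# Illustrative sketch of an AVH-style score (not the authors' code).
import torch
import torch.nn.functional as F

def angular_visual_hardness(features, class_weights, targets):
    """Normalized angular distance to the target class.

    features:      (N, D) penultimate-layer embeddings from a CNN
    class_weights: (C, D) rows of the final classifier's weight matrix
    targets:       (N,)   ground-truth class indices
    """
    # Cosine similarity between each embedding and every class weight vector.
    f = F.normalize(features, dim=1)       # (N, D), unit-norm embeddings
    w = F.normalize(class_weights, dim=1)  # (C, D), unit-norm class weights
    cos = (f @ w.t()).clamp(-1.0, 1.0)     # (N, C), clamp for numerical safety
    angles = torch.acos(cos)               # (N, C), angles in [0, pi]

    # Angle to the target class, normalized by the sum of angles to all classes.
    target_angle = angles.gather(1, targets.unsqueeze(1)).squeeze(1)  # (N,)
    return target_angle / angles.sum(dim=1)

# Example usage with random tensors (8 samples, 512-dim features, 10 classes):
feats = torch.randn(8, 512)
weights = torch.randn(10, 512)
labels = torch.randint(0, 10, (8,))
avh = angular_visual_hardness(feats, weights, labels)  # higher = harder sample
```

In a self-training setting, a score like this could in principle be used to rank unlabeled samples by hardness, e.g. to down-weight or defer pseudo-labels for high-AVH samples; the exact selection strategy used in the paper is not specified in this abstract.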
Code: https://drive.google.com/drive/folders/1AqAhFI93cGT4uut05c5rQWxCEeeVA_TG?usp=sharing
Keywords: angular similarity, self-training, hard sample mining