Keywords: Primary Visual Cortex, Divisive Normalization, Object Recognition, Robustness, Common Corruptions, Biologically-Inspired Neural Networks
Abstract: Some convolutional neural networks (CNNs) have achieved state-of-the-art performance in object classification. However, they often fail to generalize to images perturbed by common corruptions, which impairs their deployment in real-world scenarios. Recent studies have shown that more closely mimicking biological vision in early areas such as the primary visual cortex (V1) can lead to some improvements in robustness. Here, we extended this approach by introducing, at the V1 stage of a biologically-inspired CNN, a layer based on the neuroscientific model of divisive normalization, which has been widely used to model activity in early vision. Compared to the standard base model, the resulting model family, VOneNetDN, maintained clean accuracy (relative accuracy of 99%) while greatly improving robustness to common image corruptions (relative gain of 18%). Compared to the model without divisive normalization, VOneNetDN showed better alignment to primate V1 for some response properties (contrast and surround modulation) but not all. These results provide further evidence that neuroscience can still contribute to progress in computer vision.
Supplementary Material: zip