- Keywords: biologically plausible deep learning, recurrent neural networks, perceptual grouping, horizontal connections, visual neuroscience, perceptual robustness, Gestalt psychology
- TL;DR: We present V1Net -- a novel recurrent neural network that models cortical horizontal connections, which give rise to robust visual representations through perceptual grouping.
- Abstract: The primate visual system builds robust, multi-purpose representations of the external world to support diverse downstream cortical processes. Such representations must be invariant to sensory inconsistencies caused by, for example, dynamically varying lighting and local texture distortion. A key architectural feature that combats such environmental irregularities is ‘long-range horizontal connections’, which aid the perception of the global form of objects. In this work, we explore the introduction of such horizontal connections into standard deep convolutional networks. We present V1Net -- a novel convolutional-recurrent unit that models linear and nonlinear horizontal inhibitory and excitatory connections inspired by primate visual cortical connectivity. We introduce the Texturized Challenge -- a new benchmark for evaluating object recognition performance under perceptual noise -- and use it to evaluate V1Net against an array of carefully selected control models with and without recurrent processing. Additionally, we present an ablation study of V1Net that demonstrates the utility of diverse neurally inspired horizontal connections for state-of-the-art AI systems on the task of object boundary detection in natural images. We also report the emergence of several biologically plausible horizontal connectivity patterns, namely center-on surround-off, association-field, and border-ownership patterns, in a V1Net model trained to perform boundary detection on natural images from the Berkeley Segmentation Dataset 500 (BSDS500). Our findings suggest an increased representational similarity between V1Net and biological visual systems, and they highlight the importance of neurally inspired recurrent contextual processing principles for learning visual representations that are robust to perceptual noise and for advancing the state of the art in computer vision.