Brain-inspired $L_p$-Convolution benefits large kernels and aligns better with visual cortex

21 Sept 2023 (modified: 11 Feb 2024)Submitted to ICLR 2024EveryoneRevisionsBibTeX
Primary Area: applications to neuroscience & cognitive science
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Lp-Convolution, Receptive Field, Multivariate p-generalized normal distribution, Representation Similarity, Visual Cortex, Gaussian Sparsity
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Brain-inspired $L_p$-Convolution benefits large kernels and aligns better with visual cortex
Abstract: Convolutional Neural Networks (CNNs) have profoundly influenced the field of computer vision, drawing significant inspiration from the visual processing mechanisms inherent in the brain. Despite sharing fundamental structural and representational similarities with the biological visual system, differences in local connectivity patterns within CNNs open up an interesting area to explore. In this work, we explore whether integrating biologically observed receptive fields (RFs) can enhance model performance and foster alignment with brain representations. We introduce a novel methodology, termed $L_p$-convolution, which employs the multivariate $p$-generalized normal distribution as an adaptable $L_p$-masks, to reconcile disparities between artificial and biological RFs. $L_p$-masks finds the optimal RFs through task-dependent adaptation of conformation such as distortion, scale, and rotation. This allows $L_p$-convolution to excel in tasks that require flexible RF shapes, including not only square-shaped regular RFs but also horizontal and vertical ones. Furthermore, we demonstrate that $L_p$-convolution with biological RFs significantly enhances the performance of large kernel CNNs possibly by introducing Gaussian-like structured sparsity in convolution. Lastly, we present that neural representations of CNNs align more closely with the visual cortex when $L_p$-convolution is close to biological RFs. This research shines a light on the potential of brain-inspired models that merge insights from neuroscience and machine learning, with the hope of bridging the gap between artificial and biological visual systems.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3144
Loading