Brain-inspired $L_p$-Convolution benefits large kernels and aligns better with visual cortex

Jea Kwon; Kyungwoo Song; C. Justin Lee

Published: 02 Mar 2024, Last Modified: 05 May 2024. Venue: ICLR 2024 Workshop Re-Align (Poster). License: CC BY 4.0

Track: long paper (up to 9 pages)

Keywords: Lp-Convolution, Receptive Field, Multivariate p-generalized normal distribution, Representation Similarity, Visual Cortex, Gaussian Sparsity

TL;DR: Brain-inspired $L_p$-Convolution benefits large kernels and aligns better with visual cortex

Abstract: Convolutional Neural Networks (CNNs) have profoundly influenced the field of computer vision, drawing significant inspiration from the visual processing mechanisms of the brain. Although CNNs share fundamental structural and representational similarities with the biological visual system, their local connectivity patterns differ, which opens up an interesting area to explore. In this work, we explore whether integrating biologically observed receptive fields (RFs) can enhance model performance and foster alignment with brain representations. We introduce a novel methodology, termed $L_p$-convolution, which employs the multivariate $p$-generalized normal distribution as adaptable $L_p$-masks to reconcile disparities between artificial and biological RFs. $L_p$-masks find optimal RFs through task-dependent adaptation of their conformation, such as distortion, scale, and rotation. This allows $L_p$-convolution to excel in tasks that require flexible RF shapes, including not only square-shaped regular RFs but also horizontal and vertical ones. Furthermore, we demonstrate that $L_p$-convolution with biological RFs significantly enhances the performance of large-kernel CNNs, possibly by introducing structured sparsity, inspired by the $p$-generalized normal distribution, into convolution. Lastly, we show that neural representations of CNNs align more closely with the visual cortex when the RFs of $L_p$-convolution are close to biological RFs. This research highlights the potential of brain-inspired models that merge insights from neuroscience and machine learning, with the hope of bridging the gap between artificial and biological visual systems.
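
To make the idea of an $L_p$-mask concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the exact functional form, so the sketch assumes a mask of the shape exp(-sum_i |(C x)_i|^p), modeled on the density of a p-generalized normal distribution. The function names `lp_mask` and `masked_kernel`, the 2x2 matrix `C` (meant to stand in for scale, rotation, and distortion), and its default value are all hypothetical choices made for illustration.

```python
import numpy as np


def lp_mask(kernel_size, p, C=None):
    """Illustrative L_p mask on a kernel_size x kernel_size grid.

    Assumed form (not taken from the paper): exp(-sum_i |(C @ offset)_i| ** p),
    where offset is the (row, col) position relative to the kernel center.
    """
    if C is None:
        # Assumed default: isotropic scaling so the kernel edge lies just
        # inside unit distance from the center.
        C = np.eye(2) * (2.0 / kernel_size)
    r = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    yy, xx = np.meshgrid(r, r, indexing="ij")
    offsets = np.stack([yy, xx], axis=-1)   # shape (k, k, 2)
    z = offsets @ C.T                        # apply C to every offset
    return np.exp(-np.sum(np.abs(z) ** p, axis=-1))


def masked_kernel(weight, p, C=None):
    """Multiply conv weights of shape (..., k, k) elementwise by the mask."""
    k = weight.shape[-1]
    return weight * lp_mask(k, p, C)
```

Under these assumptions, `lp_mask(7, p=2.0)` gives a Gaussian-like profile that decays toward the corners, while a large exponent such as `lp_mask(7, p=16.0)` is close to uniform across the window, which resembles an ordinary square kernel. Passing a non-diagonal or anisotropic `C` would elongate or rotate the mask, loosely mirroring the horizontal and vertical RF shapes mentioned in the abstract.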

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.

Submission Number: 6
