Convolutional architectures are cortex-aligned de novo

Atlas Kazemian, Eric Elmoznino, Michael F. Bonner

Published: 13 Nov 2025 · Last Modified: 19 Feb 2026 · Nature Machine Intelligence · CC BY-SA 4.0
Abstract: What underlies the emergence of cortex-aligned representations in deep neural network models of vision? Earlier work suggested that shared architectural constraints were a major factor, but the success of widely varied architectures after pretraining raises critical questions about the importance of architectural constraints. Here we show that in wide networks with minimal training, architectural inductive biases have a prominent role. We examined networks with varied architectures but no pretraining and quantified their ability to predict image representations in the visual cortices of monkeys and humans. We found that cortex-aligned representations emerge in convolutional architectures that combine two key manipulations of dimensionality: compression in the spatial domain, through pooling, and expansion in the feature domain by increasing the number of channels. We further show that the inductive biases of convolutional architectures are critical for obtaining performance gains from feature expansion—dimensionality manipulations were relatively ineffective in other architectures and in convolutional models with targeted lesions. Our findings suggest that the architectural constraints of convolutional networks are sufficiently close to the constraints of biological vision to allow many aspects of cortical visual representation to emerge even before synaptic connections have been tuned through experience.

Summary: Kazemian et al. report that untrained convolutional networks with wide layers predict primate visual cortex responses nearly as well as task-optimized networks, revealing how architectural constraints shape brain-like representations in deep networks.
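The two dimensionality manipulations described in the abstract—spatial compression through pooling and feature expansion through added channels—can be sketched with random, untrained convolutional layers. This is a minimal illustrative NumPy sketch; the layer sizes and depth are arbitrary choices for illustration, not the architectures evaluated in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_layer(x, n_out, k=3):
    """Untrained (randomly weighted) valid convolution with ReLU.

    x: (C, H, W) -> (n_out, H-k+1, W-k+1). Expanding n_out beyond C
    is the feature-domain expansion described in the abstract.
    """
    C, H, W = x.shape
    w = rng.standard_normal((n_out, C, k, k)) / np.sqrt(C * k * k)
    out = np.empty((n_out, H - k + 1, W - k + 1))
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            patch = x[:, i:i + k, j:j + k]
            out[:, i, j] = np.maximum((w * patch).sum(axis=(1, 2, 3)), 0.0)
    return out

def max_pool(x, s=2):
    """Spatial-domain compression: s-by-s max pooling (odd edges truncated)."""
    C, H, W = x.shape
    H2, W2 = H // s, W // s
    return x[:, :H2 * s, :W2 * s].reshape(C, H2, s, W2, s).max(axis=(2, 4))

# Each stage compresses space while expanding features (channel counts here
# are hypothetical, chosen only to show the two manipulations together).
img = rng.standard_normal((3, 32, 32))   # toy RGB input
h = max_pool(conv_layer(img, 64))        # 3 -> 64 channels, 30x30 -> 15x15
h = max_pool(conv_layer(h, 256))         # 64 -> 256 channels, 13x13 -> 6x6
print(h.shape)  # -> (256, 6, 6)
```

In an encoding-model analysis, activations like `h` would be regressed against recorded cortical responses; here the weights are never trained, so any predictive power comes from the architectural inductive biases alone.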