V1T: large-scale mouse V1 response prediction using a Vision Transformer

Bryan M. Li; Isabel Maria Cornacchia; Nathalie Rochefort; Arno Onken

Published: 17 Aug 2023, Last Modified: 17 Sept 2024. Accepted by TMLR.

Abstract: Accurate predictive models of visual cortex neural responses to natural visual stimuli remain a challenge in computational neuroscience. In this work, we introduce $V{\small 1}T$, a novel Vision Transformer-based architecture that learns a shared visual and behavioral representation across animals. We evaluate our model on two large datasets recorded from mouse primary visual cortex, where it outperforms previous convolution-based models by more than 12.7% in prediction performance. Moreover, we show that the self-attention weights learned by the Transformer correlate with the population receptive fields. Our model thus sets a new benchmark for neural response prediction and can be used jointly with behavioral and neural recordings to reveal meaningful characteristic features of the visual cortex.

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: N/A

Code: https://github.com/bryanlimy/V1T

Supplementary Material: zip

Assigned Action Editor: ~Simon_Kornblith1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 1155
