Overcoming Output Dimension Collapse: When Sparsity Enables Zero-shot Brain-to-image Reconstruction at Small Data Scales
Abstract: Advances in brain-to-image reconstruction are enabling us to externalize the subjective visual experiences encoded in the brain as images. A key challenge in this task is data scarcity: a translator that maps brain activity to latent image features is trained on a limited number of brain-image pairs, making the translator a bottleneck for zero-shot reconstruction beyond the training stimuli. In this paper, we mathematically analyze the behavior of two translators commonly used in recent reconstruction pipelines: naive multivariate linear regression and sparse multivariate linear regression. We define the data scale as the ratio of the number of training samples to the latent feature dimensionality and characterize the behavior of each model across data scales. Building on a standard structural property of naive multivariate regression, we first show that the resulting "output dimension collapse" can become a practical generalization bottleneck in brain-to-image reconstruction. We introduce the best prediction diagnostic, which is computable without brain activity, to quantify the practical impact of this collapse. We then analyze sparse linear regression models in a student–teacher framework and derive expressions for the prediction error in terms of data scale and other sparsity-related parameters. Our analysis clarifies when variable selection can reduce prediction error at small data scales by exploiting the sparsity of the brain-to-feature mapping. Our findings provide quantitative guidelines for diagnosing output dimension collapse and for designing effective translators and feature representations for zero-shot reconstruction.
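For intuition, here is a minimal synthetic sketch (not from the paper or the linked repository) of the collapse the abstract describes: with a min-norm multivariate least-squares translator, every predicted feature vector is a linear combination of the training targets and therefore lies in their at-most-n-dimensional span, whereas a sparse translator with per-dimension variable selection is not confined to that span. All sizes, the noise level, and the Lasso penalty below are hypothetical choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
# Hypothetical sizes: far fewer training pairs than latent feature dimensions
n_train, n_test, d_x, d_y = 50, 20, 200, 500

# Sparse ground-truth brain-to-feature map: each output dimension
# depends on only a handful of input (voxel) dimensions
W_true = rng.normal(size=(d_x, d_y)) * (rng.random((d_x, d_y)) < 0.02)
X_train = rng.normal(size=(n_train, d_x))
X_test = rng.normal(size=(n_test, d_x))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, d_y))

# Naive multivariate regression (min-norm least squares): every prediction
# is a linear combination of the n_train training targets, so all predicted
# feature vectors lie in span(Y_train) -- the "output dimension collapse"
Y_hat_naive = X_test @ (np.linalg.pinv(X_train) @ Y_train)

# Sparse translator: per-output-dimension variable selection via Lasso;
# its weight matrix is not constrained to rank <= n_train
Y_hat_sparse = Lasso(alpha=0.05, max_iter=5000).fit(X_train, Y_train).predict(X_test)

def fraction_outside_train_span(Y_hat, Y_train):
    """Fraction of prediction energy orthogonal to span(Y_train)."""
    basis = np.linalg.svd(Y_train, full_matrices=False)[2]  # orthonormal rows
    residual = Y_hat - (Y_hat @ basis.T) @ basis
    return np.linalg.norm(residual) / np.linalg.norm(Y_hat)

print("naive :", fraction_outside_train_span(Y_hat_naive, Y_train))   # ~0 (collapsed)
print("sparse:", fraction_outside_train_span(Y_hat_sparse, Y_train))  # clearly > 0
```

The outside-span fraction printed here is only a loose analogue of the paper's best prediction diagnostic, which the abstract states is computable without brain activity; see the paper and repository for the actual definition.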
Submission Type: Long submission (more than 12 pages of main content)
Code: https://github.com/KamitaniLab/OvercomingOutputDimensionCollapse
Supplementary Material: zip
Assigned Action Editor: ~Zhengzhong_Tu1
Submission Number: 7036