Overcoming Output Dimension Collapse: When Sparsity Enables Zero-shot Brain-to-image Reconstruction at Small Data Scales
Abstract: Advances in brain-to-image reconstruction are enabling us to externalize the subjective visual experiences encoded in the brain as images.
A key challenge in this task is data scarcity: a translator that maps brain activity to latent image features is trained on a limited number of brain-image pairs, making the translator a bottleneck for zero-shot reconstruction beyond the training stimuli.
In this paper, we provide a theoretical analysis of two translator designs widely used in recent reconstruction pipelines: naive multivariate linear regression and sparse multivariate linear regression.
We define the data scale as the ratio of the number of training samples to the latent feature dimensionality and characterize the behavior of each model across data scales.
We first show that the naive linear regression model, which uses a shared set of input variables for all outputs, suffers from "output dimension collapse" at small data scales, restricting generalization beyond the training data.
We then analyze sparse linear regression models in a student-teacher framework and derive expressions for the prediction error in terms of data scale and other sparsity-related parameters.
Our analysis clarifies when variable selection can reduce prediction error at small data scales by exploiting the sparsity of the brain-to-feature mapping.
Our findings provide quantitative guidelines for diagnosing output dimension collapse and for designing effective translators and feature representations for zero-shot reconstruction.
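The collapse phenomenon summarized in the abstract can be reproduced on synthetic data. The sketch below is not taken from the paper; the dimensions, the random ground-truth mapping, and the NumPy minimum-norm least-squares estimator are illustrative assumptions. It shows that when the number of training pairs is much smaller than the latent feature dimensionality, the predictions of a naive linear translator are confined to the span of the training targets, i.e., to a subspace of dimension at most the number of training samples.

```python
# Minimal synthetic illustration (not from the paper): output dimension collapse
# in naive multivariate linear regression when n_train << d_out.
import numpy as np

rng = np.random.default_rng(0)
n_train, d_in, d_out = 50, 200, 1000        # small data scale: few samples, high-dim latent features

X_train = rng.standard_normal((n_train, d_in))
W_true = rng.standard_normal((d_in, d_out))  # hypothetical ground-truth brain-to-feature mapping
Y_train = X_train @ W_true

# Naive translator: one shared least-squares fit for all output dimensions
# (minimum-norm solution; ridge regression exhibits the same collapse).
W_hat, *_ = np.linalg.lstsq(X_train, Y_train, rcond=None)

X_test = rng.standard_normal((500, d_in))    # unseen "brain activity"
Y_pred = X_test @ W_hat

# Every predicted feature vector is a linear combination of the n_train training
# targets, so predictions occupy a subspace of dimension at most n_train << d_out.
print(np.linalg.matrix_rank(Y_pred))   # ~50, far below d_out = 1000
print(np.linalg.matrix_rank(Y_train))  # 50
```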
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Zhengzhong_Tu1
Submission Number: 7036