Abstract: Biological sensory systems appear to rely on canonical nonlinear computations that can be readily adapted to a broad range of representational objectives. Here we test the hypothesis that one such computation—multiplicative interaction—is a pervasive nonlinearity that underlies the representational transformations in human vision. We computed local multiplicative interactions of features in several classes of convolutional models and used the resulting representations to predict object-evoked responses in voxelwise models of human fMRI data. We found that multiplicative interactions predicted widespread representations throughout the ventral stream and were competitive with state-of-the-art supervised deep nets. Surprisingly, the performance of multiplicative interactions did not require supervision and could be achieved even with random or hand-engineered convolutional filters. These findings suggest that multiplicative interaction may be a canonical computation for feature transformations in human vision.
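To make the pipeline described in the abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It convolves an image with a bank of random filters, takes products of every pair of feature maps at each spatial location (the local multiplicative interaction), averages over space, and fits a linear voxelwise model. The function names, the use of all channel pairs, the spatial averaging, the ridge penalty `alpha`, and the synthetic "voxel" data are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np

def conv2d_valid(image, filters):
    """Cross-correlate a 2-D image with a bank of k x k filters ('valid' mode)."""
    k = filters.shape[-1]
    # windows has shape (H - k + 1, W - k + 1, k, k)
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    # result has shape (n_filters, H - k + 1, W - k + 1)
    return np.einsum("hwij,fij->fhw", windows, filters)

def multiplicative_features(image, filters):
    """Products of all channel pairs at each location, averaged over space.

    Assumption: pairs include each channel with itself (squared terms).
    """
    maps = conv2d_valid(image, filters)
    i, j = np.triu_indices(maps.shape[0])
    products = maps[i] * maps[j]          # (n_pairs, H', W')
    return products.mean(axis=(1, 2))     # one value per channel pair

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; one weight column per voxel (assumed model)."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)

# Synthetic stand-ins for stimuli and fMRI responses (hypothetical data).
rng = np.random.default_rng(0)
filters = rng.standard_normal((4, 3, 3))   # random, untrained filters
images = rng.standard_normal((20, 16, 16)) # 20 grayscale "stimuli"
voxels = rng.standard_normal((20, 5))      # 5 simulated voxel responses

X = np.stack([multiplicative_features(im, filters) for im in images])
W = fit_ridge(X, voxels)                   # weights: (n_pairs, n_voxels)
predicted = X @ W                          # predicted voxel responses
```

With 4 filters there are 10 channel pairs, so `X` has one row per image and 10 columns. A real analysis would evaluate `predicted` on held-out stimuli rather than on the training images.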