TL;DR: A mathematical link between robust pattern recognition and pattern generation explains how visual systems can flexibly generate internal representations, including illusory percepts.
Abstract: Visual illusions have long been considered perceptual mistakes, highlighting a perceived gap between biological and artificial vision. Here, we challenge this view by revealing that robust deep neural networks (DNNs) trained for object recognition implicitly contain a generative model capable of representing illusory contours and shapes. This finding suggests that illusions are not errors, but emergent properties of efficient visual processing. We uncover a mathematical correspondence between optimization for robust pattern recognition and optimization for pattern generation. This insight provides a potential explanation for how visual systems, primarily tuned for pattern recognition, can flexibly generate internal representations, including illusory percepts.
Using a robust object recognition model (ResNet50) trained on ImageNet, we demonstrate that the errors propagated during inference approximate the gradient of the log conditional probability $\log p(x|y)$, directly linking recognition error to learned priors. By repurposing the computational graph conventionally used for learning, we query this implicit generative model without additional optimization.
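As a rough illustration of what propagating the recognition error back to the input means, the sketch below uses a toy linear softmax classifier in place of ResNet50. It is not the authors' implementation: the names `log_prob_and_input_grad` and `ascend_input`, the weights, and the step size are all hypothetical. It computes the exact input gradient of $\log p(y|x)$ and shows only the mechanics of reusing the backward pass on the input. Whether that gradient approximates the gradient of $\log p(x|y)$ depends on robust training, which this toy does not model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_prob_and_input_grad(W, b, x, y):
    """Return log p(y|x) and its gradient with respect to the input x.

    W is a list of weight rows (one per class), b a list of biases.
    This is a hypothetical toy classifier, not the network from the abstract.
    """
    logits = [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    p = softmax(logits)
    # Output error signal: one-hot(y) minus the predicted probabilities.
    err = [(1.0 if k == y else 0.0) - p[k] for k in range(len(p))]
    # Backward pass to the input instead of the weights: grad_x = W^T err.
    grad = [sum(W[k][j] * err[k] for k in range(len(W))) for j in range(len(x))]
    return math.log(p[y]), grad


def ascend_input(W, b, x, y, step=0.5, n_steps=20):
    """Move the input along the propagated error; the weights stay fixed."""
    x = list(x)
    for _ in range(n_steps):
        _, g = log_prob_and_input_grad(W, b, x, y)
        x = [xi + step * gi for xi, gi in zip(x, g)]
    return x


# Assumed toy setup: two classes, two input dimensions.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
x0 = [0.0, 0.0]
before, grad0 = log_prob_and_input_grad(W, b, x0, 0)
x1 = ascend_input(W, b, x0, 0)
after, _ = log_prob_and_input_grad(W, b, x1, 0)
```

In this sketch the input drifts toward the pattern the classifier associates with class 0, so `after` is larger than `before`. With a real network, the same backward pass could be obtained from an automatic differentiation library rather than written by hand.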
When presented with classical illusion stimuli, our model generates representations that mirror perceptual experiences in biological vision. For a Kanizsa square input, edge-like patterns emerge in the perceived 'white square' area. With Rubin's vase, the network produces face-like or vase-like patterns depending on its training (VGGFace vs. ImageNet). These induced activities in early layers capture experimental findings of illusory contours and shapes in early visual areas across species. Our work reconciles the views of the visual cortex as both a pattern recognition and a generative model in a unified framework. By demonstrating that robust pattern recognition networks inherently embody generative capabilities, we provide insights into how the brain might integrate prior knowledge with sensory input. This suggests that visual illusions, far from being mistakes, are indicators of the visual system's ability to generate and manipulate internal representations—a feature crucial for efficient visual processing in complex, ambiguous environments.
Style Files: I have used the style files.
Debunking Challenge: This submission is an entry to the debunking challenge.
Submission Number: 72