Filling in the details: Perceiving from low fidelity visual input

Submitted to ICLR 2017
Abstract: Humans perceive their surroundings in great detail even though most of our visual field is reduced to low-fidelity, color-deprived (e.g., dichromatic) input by the retina. In contrast, most deep learning architectures deploy computational resources homogeneously to every part of the visual input. Is such a prodigal deployment of resources necessary? In this paper, we present a framework for investigating the extent to which connectionist architectures can perceive an image in full detail even when presented with low-acuity, distorted input. Our goal is to initiate investigations that will be fruitful both for engineering better networks and also for eventually testing hypotheses on the neural mechanisms responsible for our own visual system's ability to perceive missing information. We find that networks can compensate for low-acuity input by learning global feature functions that allow the network to fill in some of the missing details. For example, the networks accurately perceive shape and color in the periphery, even when 75% of the input is achromatic and low resolution. On the other hand, the network is prone to mistakes similar to those humans make; for example, when presented with a fully grayscale landscape image, it perceives the sky as blue even when the sky is actually a red sunset.
TL;DR: Using generative models to create images from impoverished input similar to that received by our visual cortex
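
As a rough illustration of the input degradation the abstract describes (a minimal sketch, not the paper's released code; the function name, fovea fraction, and downsampling factor are assumptions): keeping a central fovea that spans half the image's width and height leaves roughly 75% of the pixels in a grayscale, low-resolution periphery.

```python
# Hypothetical sketch of a foveated degradation: full-detail color fovea,
# achromatic low-resolution periphery (~75% of pixels for fovea_frac=0.5).
import numpy as np
from PIL import Image

def degrade(img: Image.Image, fovea_frac: float = 0.5, factor: int = 4) -> Image.Image:
    """Keep a central fovea_frac crop intact; gray out and blur the rest."""
    w, h = img.size
    # Low-fidelity periphery: grayscale, downsampled, then upsampled back.
    periphery = img.convert("L").resize((w // factor, h // factor))
    periphery = periphery.resize((w, h), Image.BILINEAR).convert("RGB")
    out = np.array(periphery)
    # Paste the full-color, full-resolution fovea back into the center.
    fw, fh = int(w * fovea_frac), int(h * fovea_frac)
    x0, y0 = (w - fw) // 2, (h - fh) // 2
    out[y0:y0 + fh, x0:x0 + fw] = np.array(img)[y0:y0 + fh, x0:x0 + fw]
    return Image.fromarray(out)
```

A generative model would then be trained to reconstruct the original full-detail image from this degraded version, with filling-in quality measured in the peripheral region.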
Conflicts: cs.umass.edu, harvard.edu, cs.umb.edu
Keywords: Deep learning, Computer vision, Semi-Supervised Learning