Keywords: Geometry, Cognitive Science, Psychology, Vision, Robustness, Safety
TL;DR: We introduce an extremely simple shape detection task that elicits a fundamental inconsistency between the behavior of humans and that of widely used neural network models
Abstract: It is well known that modern computer vision systems often exhibit behaviors misaligned with those of humans: from adversarial attacks to image corruptions, deep learning vision models suffer in a variety of settings that humans capably handle. In light of these phenomena, here we introduce another, orthogonal perspective for studying the human-machine vision gap. We revisit the task of recovering images under degradation, first introduced over 30 years ago in the Recognition-by-Components theory of human vision. Specifically, we study the performance and behavior of neural networks on the seemingly simple task of classifying regular polygons at varying orders of degradation along their perimeters. To this end, we implement the Automated Shape Recoverability Test for rapidly generating large-scale datasets of perimeter-degraded regular polygons, modernizing the historically manual creation of image recoverability experiments. We then investigate the capacity of neural networks to recognize and recover such degraded shapes when initialized with different priors. Ultimately, we find that neural networks' behavior on this simple task conflicts with human behavior, raising a fundamental question about the robustness and learning capabilities of modern computer vision models.
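To make the task concrete, here is a minimal sketch of how perimeter-degraded regular polygons could be generated automatically. This is not the paper's actual implementation of the Automated Shape Recoverability Test; the function names, parameters (n_sides, degradation, segments_per_edge), and the uniform-random segment-removal scheme are illustrative assumptions.

```python
"""Minimal sketch: generating perimeter-degraded regular polygons.

Assumes Pillow is available. All names and parameters here are
hypothetical, not the paper's actual API.
"""
import math
import random
from PIL import Image, ImageDraw


def regular_polygon_vertices(n_sides, radius, center):
    """Vertices of a regular n-gon inscribed in a circle."""
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * k / n_sides - math.pi / 2),
         cy + radius * math.sin(2 * math.pi * k / n_sides - math.pi / 2))
        for k in range(n_sides)
    ]


def degraded_polygon(n_sides=5, degradation=0.4, size=128,
                     segments_per_edge=20, rng=None):
    """Draw a regular polygon with a fraction of its perimeter erased.

    `degradation` is the fraction of short perimeter segments removed
    uniformly at random (0.0 = intact outline, 1.0 = blank image).
    """
    rng = rng or random.Random()
    img = Image.new("L", (size, size), color=255)
    draw = ImageDraw.Draw(img)
    verts = regular_polygon_vertices(n_sides, radius=size * 0.4,
                                     center=(size / 2, size / 2))
    # Split each edge into short segments, then keep a random subset.
    segments = []
    for i in range(n_sides):
        (x0, y0), (x1, y1) = verts[i], verts[(i + 1) % n_sides]
        for s in range(segments_per_edge):
            t0, t1 = s / segments_per_edge, (s + 1) / segments_per_edge
            segments.append(
                [(x0 + (x1 - x0) * t0, y0 + (y1 - y0) * t0),
                 (x0 + (x1 - x0) * t1, y0 + (y1 - y0) * t1)])
    keep = rng.sample(range(len(segments)),
                      int(len(segments) * (1.0 - degradation)))
    for idx in keep:
        draw.line(segments[idx], fill=0, width=2)
    return img


if __name__ == "__main__":
    # Example: a 40%-degraded pentagon, labeled by its side count.
    degraded_polygon(n_sides=5, degradation=0.4).save("pentagon_d40.png")
```

Sweeping `degradation` over a grid and `n_sides` over the polygon classes of interest would yield a labeled dataset in the spirit the abstract describes, with the degradation level serving as the difficulty axis along which human and model behavior can be compared.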
Supplementary Material: pdf
Submission Number: 428