Geon3D: Benchmarking 3D Shape Bias towards Building Robust Machine Vision

Yutaro Yamada; Yuval Kluger; Sahand Negahban; Ilker Yildirim

08 Jun 2021 (modified: 24 May 2023). Submitted to NeurIPS 2021 Datasets and Benchmarks Track (Round 1). Readers: Everyone

Keywords: robust vision, robustness, adversarial examples, common corruptions, 3D reconstruction, vision science

Abstract: Human vision, unlike existing machine vision systems, is surprisingly robust to environmental variation, including both naturally occurring disturbances (e.g., fog, snow, occlusion) and artificial corruptions (e.g., adversarial examples). Such robustness arises, at least in part, from our ability to infer 3D geometry from 2D retinal projections, that is, the ability to go from images to their underlying causes, including the 3D scene. How can we design machine learning systems with such a strong shape bias? In this work, we view 3D reconstruction as a pretraining method for building more robust vision systems. Recent studies explore the role of shape bias in the robustness of vision models. However, most current ImageNet-based approaches to increasing shape bias are indirect: they attempt to reduce texture bias via structured data augmentation. These approaches neither directly nor fully exploit the relationship between 2D features and their underlying 3D shapes. To fill this gap, we introduce a novel dataset called Geon3D, which is derived from objects that emphasize variation across shape features to which the human visual system is thought to be particularly sensitive. This dataset enables, for the first time, a controlled setting in which we can isolate the effect of "3D shape bias" in robustifying neural networks, and it informs more direct approaches to increasing shape bias by exploiting 3D vision tasks. Using Geon3D, we find that CNNs pretrained on 3D reconstruction are more resilient to viewpoint change, rotation, and shift than regular CNNs. Further, when combined with adversarial training, 3D-reconstruction-pretrained models improve adversarial and common-corruption robustness over vanilla adversarially trained models. This suggests that incorporating 3D shape bias is a promising direction for building robust machine vision systems.

Supplementary Material: zip

URL: https://drive.google.com/uc?id=1v5XwO-QrnB_j9XhJJl4c7d7hMQf-v6gq&export=download
