- TL;DR: DNNs for image segmentation can implement solutions for the insideness problem but only some recurrent nets could learn them with a specific type of supervision.
- Abstract: Image segmentation aims at grouping pixels that belong to the same object or region. At the heart of image segmentation lies the problem of determining whether a pixel is inside or outside a region, which we denote as the "insideness" problem. Many Deep Neural Networks (DNNs) variants excel in segmentation benchmarks, but regarding insideness, they have not been well visualized or understood: What representations do DNNs use to address the long-range relationships of insideness? How do architectural choices affect the learning of these representations? In this paper, we take the reductionist approach by analyzing DNNs solving the insideness problem in isolation, i.e. determining the inside of closed (Jordan) curves. We demonstrate analytically that state-of-the-art feed-forward and recurrent architectures can implement solutions of the insideness problem for any given curve. Yet, only recurrent networks could learn these general solutions when the training enforced a specific "routine" capable of breaking down the long-range relationships. Our results highlights the need for new training strategies that decompose the learning into appropriate stages, and that lead to the general class of solutions necessary for DNNs to understand insideness.
- Keywords: Image Segmentation, Deep Networks for Spatial Relationships, Visual Routines, Recurrent Neural Networks
- Original Pdf: pdf