- Abstract: Convolutional Neural Networks (CNNs) have emerged as highly successful tools for image generation, recovery, and restoration. This success is often attributed to large amounts of training data. In contrast, a number of recent experimental results suggest that a major contributing factor to this success is that convolutional networks impose strong prior assumptions about natural images. A surprising experiment that highlights this structural bias towards simple, natural images is that one can remove various kinds of noise and corruption from a corrupted natural image by simply fitting (via gradient descent) a randomly initialized, over-parameterized convolutional generator to this single image. While this over-parameterized model can eventually fit the corrupted image perfectly, surprisingly, after only a few iterations of gradient descent one obtains the uncorrupted image, without using any training data. This intriguing phenomenon has enabled state-of-the-art CNN-based denoising as well as regularization in linear inverse problems such as compressive sensing. In this paper we take a step towards demystifying this experimental phenomenon by attributing the effect to particular architectural choices of convolutional networks, namely fixed convolutional operations. We then formally characterize the dynamics of fitting a two-layer convolutional generator to a noisy signal and prove that early-stopped gradient descent denoises/regularizes. This result relies on showing that convolutional generators fit the structured part of an image significantly faster than the corrupted portion.
- Keywords: theory for deep learning, convolutional network, deep image prior, deep decoder, dynamics of gradient descent, overparameterization
- Code: https://www.dropbox.com/s/vtvavzry9sp5wrj/overparameterized_convolutional_generators-master.zip?dl=0
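
The claim that a generator built on a fixed convolutional operation fits the structured part of a signal faster than the noise can be illustrated with a heavily simplified sketch. This is not the paper's two-layer generator: it assumes a purely linear model (a fixed circular Gaussian smoothing operator applied to trainable coefficients), a one-dimensional signal, and hypothetical choices for the kernel width, step size, noise level, and number of iterations. Under those assumptions, gradient descent from a zero initialization fits low-frequency content within a few steps, while most of the noise energy is still unfitted when the iteration is stopped early.

```python
import numpy as np

n = 128
rng = np.random.default_rng(0)

# Structured signal: one low-frequency cosine (illustrative assumption).
t = np.arange(n)
clean = np.cos(2 * np.pi * 2 * t / n)
noise = 0.5 * rng.standard_normal(n)   # assumed noise level
noisy = clean + noise

# Fixed circular Gaussian smoothing kernel (assumed width of 4 samples).
dist = np.minimum(t, n - t)
kernel = np.exp(-dist**2 / (2 * 4.0**2))
kernel /= kernel.sum()
K = np.fft.fft(kernel)                 # eigenvalues of the circulant operator U


def apply_U(v):
    """Circular convolution with the fixed kernel (computed via FFT)."""
    return np.real(np.fft.ifft(K * np.fft.fft(v)))


def apply_Ut(v):
    """Transpose of the circulant operator."""
    return np.real(np.fft.ifft(np.conj(K) * np.fft.fft(v)))


def fit(target, steps, lr=1.0):
    """Gradient descent on 0.5 * ||U c - target||^2, starting from c = 0."""
    c = np.zeros(n)
    losses = []
    for _ in range(steps):
        residual = apply_U(c) - target
        losses.append(0.5 * np.sum(residual**2))
        c -= lr * apply_Ut(residual)
    return apply_U(c), losses


def mse(a, b):
    return float(np.mean((a - b) ** 2))


# Early stopping after 20 steps (hypothetical stopping time).
early, losses = fit(noisy, steps=20)
print("MSE of noisy input :", mse(noisy, clean))
print("MSE of early output:", mse(early, clean))

# The model is linear, so the fit to the noisy signal is the sum of these two.
clean_fit, _ = fit(clean, steps=20)
noise_fit, _ = fit(noise, steps=20)
print("fraction of noise energy fitted:",
      float(np.sum(noise_fit**2) / np.sum(noise**2)))
```

Because this toy model is linear, each Fourier mode of the residual shrinks by a factor of `1 - lr * |K[f]|**2` per step, so modes where the kernel's spectrum is large (low frequencies) converge quickly and modes where it is small converge slowly. The paper's analysis concerns a nonlinear two-layer generator; the sketch only conveys the intuition behind the speed gap.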