Keywords: inversion, convolution layers, optimization, sparse linear solvers, block Kaczmarz, transposed convolution
TL;DR: A fast sparse linear solver dedicated to the convolutional layer inversion problem
Abstract: Data inversion in neural networks makes it possible to map intermediate network variables
back to their input source. Inverting convolutional layers is not straightforward
and is often done approximately by training additional inversion networks.
Approaching this as a linear operator inversion problem requires extremely large
computational and memory resources, as huge matrices are involved. In this work
we present Scalable TRimmed Iterative Projections (STRIP), a fast and sparse
linear solver dedicated to the convolutional inversion problem.
We take advantage of the neural convolution structure to design a series of very
fast projections (following the block Kaczmarz paradigm). We prove conditions for
convergence for the two-strip case and propose a measure to estimate the rate of
error reduction for the general case. In practice, we show that a single pass over
the inversion matrix by STRIP can almost perfectly solve the inversion problem.
Our algorithm is fast, has a low memory footprint, and scales to very large matrices. Because
we never store the linear matrix to be inverted, we can outperform sparse linear
solvers, such as conjugate gradient, by three orders of magnitude. Extensive experiments
demonstrate that our method considerably outperforms the best competing solvers
in both speed and memory footprint. We further show that a single STRIP iteration
is more accurate than transposed convolutions, motivating the use of such methods
in U-Net architectures.
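To illustrate the block Kaczmarz paradigm the abstract builds on, here is a minimal sketch of a cyclic block Kaczmarz solver for a consistent linear system Ax = b. This is a generic textbook version, not the STRIP algorithm itself; the function name, block partitioning, and stopping rule are illustrative assumptions.

```python
import numpy as np

def block_kaczmarz(A, b, block_size=2, sweeps=500):
    """Cyclic block Kaczmarz: repeatedly project the current iterate
    onto the affine solution set of each row block of A x = b.
    Assumes each row block has linearly independent rows."""
    m, n = A.shape
    x = np.zeros(n)
    blocks = [np.arange(i, min(i + block_size, m))
              for i in range(0, m, block_size)]
    for _ in range(sweeps):
        for idx in blocks:
            A_blk = A[idx, :]            # rows of this block
            r = b[idx] - A_blk @ x       # block residual
            # least-norm correction: x += A_blk^+ r
            x += A_blk.T @ np.linalg.solve(A_blk @ A_blk.T, r)
    return x
```

Each inner step is an orthogonal projection onto the solution set of one row block, so for a consistent system the iterate converges to a solution; the paper's contribution lies in designing block projections that exploit the convolutional structure so each projection is very fast and the full matrix never needs to be stored.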
Primary Area: optimization
Submission Number: 10911