Keywords: Out of distribution, generalization, convolution, polar transformation
Abstract: Humans understand a set of canonical geometric transformations (such as translation, rotation, and scaling) that support generalization by being untethered to any specific object. We explored inductive biases that allowed artificial neural networks to learn these transformations in pixel space in a way that could generalize out-of-distribution (OOD). Unsurprisingly, we found that convolution and high training diversity were important contributing factors to OOD generalization of translation to untrained shapes, sizes, time-points, and locations; however, these were not sufficient for rotation and scaling. To remedy this, we show that two further components are needed: 1) iterative training, in which outputs are fed back as inputs, and 2) applying convolutions after conversion to log-polar space. We propose POLARAE, which exploits all four components and outperforms standard convolutional autoencoders and variational autoencoders trained iteratively with high diversity with respect to OOD generalization to larger shapes in larger grids and new locations.
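The central architectural idea, convolving after a log-polar resampling of the input so that rotation and scaling about the image centre become translations, can be illustrated with a short PyTorch sketch. This is an assumed illustration, not the authors' POLARAE implementation: the module name, grid resolution, and channel counts below are hypothetical choices.

```python
# A minimal sketch (assumed, not the authors' POLARAE code) of the
# "convolve after conversion to log-polar space" idea: resample an image
# onto a log-polar grid with grid_sample, then apply an ordinary Conv2d.
# In log-polar coordinates, rotation and scaling about the image centre
# become translations, which a translation-equivariant convolution handles.
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def log_polar_grid(height, width, n_radius=64, n_theta=64, device="cpu"):
    """Sampling grid mapping log-polar output coords back to Cartesian input coords."""
    r_max = 0.5 * math.hypot(height, width)                 # half-diagonal in pixels
    radius = torch.exp(torch.linspace(0.0, math.log(r_max), n_radius, device=device))
    theta = torch.linspace(0.0, 2.0 * math.pi, n_theta + 1, device=device)[:-1]
    r, t = torch.meshgrid(radius, theta, indexing="ij")     # (n_radius, n_theta)
    # Cartesian offsets from the centre, normalized to [-1, 1] for grid_sample
    # (last dim is (x, y), with x along the width axis).
    x = (r * torch.cos(t)) / (width / 2.0)
    y = (r * torch.sin(t)) / (height / 2.0)
    return torch.stack([x, y], dim=-1).unsqueeze(0)         # (1, n_radius, n_theta, 2)


class LogPolarConvBlock(nn.Module):
    """Hypothetical block: log-polar resampling followed by a standard convolution."""

    def __init__(self, in_channels, out_channels, n_radius=64, n_theta=64):
        super().__init__()
        self.n_radius, self.n_theta = n_radius, n_theta
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, images):
        b, _, h, w = images.shape
        grid = log_polar_grid(h, w, self.n_radius, self.n_theta,
                              device=images.device).expand(b, -1, -1, -1)
        log_polar = F.grid_sample(images, grid, align_corners=False)
        return self.conv(log_polar)                         # (b, out_channels, n_radius, n_theta)


# Toy usage on a batch of two 1-channel 64x64 images.
block = LogPolarConvBlock(in_channels=1, out_channels=8)
print(block(torch.randn(2, 1, 64, 64)).shape)               # torch.Size([2, 8, 64, 64])
```

The paper's iterative-training component would additionally feed the model's decoded output back in as the next input; that outer loop is omitted from this sketch.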