Abstract: Spatial Transformer layers allow neural networks, at least in principle, to be invariant to large spatial transformations in image data. The model has, however, seen limited uptake as most practical implementations support only transformations that are too restricted, e.g. affine or homographic maps, and/or destructive maps, such as thin plate splines. We investigate the use of flexible diffeomorphic image transformations within such networks and demonstrate that significant performance gains can be attained over currently-used models. The learned transformations are found to be both simple and intuitive, thereby providing insights into individual problem domains. With the proposed framework, a standard convolutional neural network matches state-of-the-art results on face verification with only two extra lines of simple TensorFlow code.
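To make the deployment claim concrete, below is a minimal sketch of how a diffeomorphic spatial transformer layer could be dropped in front of an ordinary convolutional network in TensorFlow. It is not the authors' released implementation: the layer name DiffeoTransformer, the localization network, and the scaling-and-squaring integration of a stationary velocity field (via TensorFlow Addons' dense_image_warp) are illustrative assumptions standing in for the transformation family used in the paper.

```python
# Hypothetical sketch only, not the paper's implementation.
import tensorflow as tf
import tensorflow_addons as tfa


class DiffeoTransformer(tf.keras.layers.Layer):
    """Warps its input with a learned, approximately diffeomorphic deformation."""

    def __init__(self, grid=(4, 4), steps=6, **kwargs):
        super().__init__(**kwargs)
        self.grid, self.steps = grid, steps
        # Small localization network predicting a coarse velocity field.
        # Zero-initialized output so the layer starts as the identity map.
        self.loc_net = tf.keras.Sequential([
            tf.keras.layers.Conv2D(16, 5, strides=2, activation="relu"),
            tf.keras.layers.Conv2D(32, 5, strides=2, activation="relu"),
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dense(grid[0] * grid[1] * 2,
                                  kernel_initializer="zeros"),
        ])

    def call(self, x):
        h, w = x.shape[1], x.shape[2]
        # Coarse stationary velocity field, one 2-D vector per grid cell.
        v = tf.reshape(self.loc_net(x), (-1, self.grid[0], self.grid[1], 2))
        # Upsample to a smooth dense field; divide by 2**steps for integration.
        v = tf.image.resize(v, (h, w)) / (2.0 ** self.steps)
        # Scaling-and-squaring: repeatedly compose the flow with itself, which
        # keeps the resulting warp approximately invertible, i.e. diffeomorphic.
        flow = v
        for _ in range(self.steps):
            flow = flow + tfa.image.dense_image_warp(flow, flow)
        # Bilinearly resample the input at the warped coordinates.
        return tfa.image.dense_image_warp(x, flow)


# Dropping the layer into an otherwise standard CNN mirrors the "two extra
# lines" usage described in the abstract (the architecture here is arbitrary).
inputs = tf.keras.Input((64, 64, 1))
warped = DiffeoTransformer()(inputs)
features = tf.keras.layers.Conv2D(32, 3, activation="relu")(warped)
pooled = tf.keras.layers.GlobalAveragePooling2D()(features)
model = tf.keras.Model(inputs, tf.keras.layers.Dense(10)(pooled))
```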