Keywords: Generative model, transform invariance, resonator network
TL;DR: We achieve translation-invariant object classification using a neural network approach based on factorization
Abstract: Invariance to transformations is still an open problem in artificial intelligence. General invariance to transformations such as translation is not naturally learned through supervised training, and many network architectures fail on input transformations not covered by the training data. Here, we take an approach based on analysis-by-synthesis, in which a generative model describes the construction of simple scenes containing MNIST digits and their transformations. Our approach constructs objects within the scene from a set of sparse features that are then given an arbitrary translation and color. A resonator network is then defined to invert the generative model. Sparse features learned from the training data act as a basis set, providing flexibility in representing the variable shapes of objects. Through an iterative process, the network localizes objects and factors out translation from the sparse features that compose them. Objects centered by the resonator network can then be classified using simple logistic regression or deep learning. The classification layer is trained solely on centered data, requiring much less training data, and the network as a whole can identify objects under arbitrary translations. The natural attention-like mechanism of the resonator network also allows for analysis of scenes with multiple objects, where the network dynamics selects and centers only one object at a time.
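To make the factorization idea concrete, below is a minimal, illustrative sketch (not the paper's implementation) of a two-factor resonator network over bipolar vectors: a scene vector binds an assumed "shape" code with an assumed "translation" code, and the network iteratively unbinds and cleans up each factor estimate against its codebook. The dimensionality, codebook sizes, and names (`D`, `shapes`, `shifts`) are assumptions chosen for illustration.

```python
# Minimal, illustrative two-factor resonator network with bipolar vectors.
# Binding is elementwise multiplication and is its own inverse; codebook
# sizes, dimensionality, and names are assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)
D = 4096                                          # vector dimensionality (assumed)
shapes = rng.choice([-1, 1], size=(D, 10))        # codebook: one vector per digit class
shifts = rng.choice([-1, 1], size=(D, 28 * 28))   # codebook: one vector per translation

def bsign(x):
    """Bipolar sign nonlinearity (maps >= 0 to +1, < 0 to -1)."""
    return np.where(x >= 0, 1, -1)

# Generative step: a scene binds one shape with one translation.
true_shape, true_shift = 3, 100
scene = shapes[:, true_shape] * shifts[:, true_shift]

# Resonator iterations: alternately unbind the scene with the current estimate
# of the other factor, then clean up against the corresponding codebook.
shape_hat = bsign(shapes.sum(axis=1))             # start from superposition of all codewords
shift_hat = bsign(shifts.sum(axis=1))
for _ in range(50):
    shape_hat = bsign(shapes @ (shapes.T @ (scene * shift_hat)))
    shift_hat = bsign(shifts @ (shifts.T @ (scene * shape_hat)))

print("decoded shape:", int(np.argmax(shapes.T @ shape_hat)))        # expect 3
print("decoded translation:", int(np.argmax(shifts.T @ shift_hat)))  # expect 100
```

In the setting described in the abstract, the recovered, translation-free object representation from this kind of iteration would then be handed to a classifier trained only on centered data.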
Submission Number: 170