Keywords: Deep Q-learning, Gauge equivariance, Manifolds, Reinforcement learning
TL;DR: We introduce a natural deep Q-learning framework for environments embedded in discrete manifolds.
Abstract: Data at any point on a manifold can be represented in the tangent plane at that point with respect to a basis, called a gauge. For arbitrary manifolds, however, the choice of gauge is not unique. Hence, for an agent traversing an environment embedded on a manifold, the same environment may appear different when the choice of gauge changes or when the agent moves to a point with a different gauge. This can hinder the agent's learning compared with learning on, say, a flat grid, where it is easy to choose a fixed gauge for every point. To address this, we provide a formulation of deep Q-learning that learns policies that are equivariant, and Q-values that are invariant, to changes in the choice of gauge. This leads to an efficient learning algorithm that is independent of the choice of gauge. Our experimental results demonstrate significant improvement over naive approaches when learning on novel environments embedded in arbitrary manifolds such as spheres, hills, and urns.
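
As a rough illustration of the two properties named in the abstract, the sketch below re-expresses tangent-plane data in a rotated gauge and checks that a scalar quantity is unchanged (invariance) while a direction-valued quantity rotates with the gauge (equivariance). It is a toy example under stated assumptions, not the paper's architecture: the gauge change is assumed to be a 2D rotation, and `toy_q`, `toy_policy`, `goal`, and `move` are hypothetical stand-ins for a Q-value, a policy, and local observations.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def to_gauge(coords, theta):
    """Coordinates of a tangent vector after rotating the basis (gauge) by theta.

    The basis rotates by R(theta), so the components transform by R(theta)^T.
    """
    return rotation(theta).T @ coords

def toy_q(goal, move):
    """Hypothetical scalar 'Q-value': an inner product of two tangent vectors."""
    return float(goal @ move)

def toy_policy(goal):
    """Hypothetical 'policy': the unit direction toward the goal."""
    return goal / np.linalg.norm(goal)

# Hypothetical observations, expressed in a reference gauge.
goal = np.array([1.0, 0.5])
move = np.array([0.2, 0.9])
theta = 0.7  # arbitrary change of gauge

# Invariance: the scalar is the same in either gauge.
q_ref = toy_q(goal, move)
q_rot = toy_q(to_gauge(goal, theta), to_gauge(move, theta))

# Equivariance: computing the direction in the new gauge equals
# transforming the direction computed in the reference gauge.
pi_ref = toy_policy(goal)
pi_rot = toy_policy(to_gauge(goal, theta))

print(np.isclose(q_ref, q_rot))
print(np.allclose(pi_rot, to_gauge(pi_ref, theta)))
```

Both checks hold here only because the toy functions are built from inner products and norms, which rotations preserve; a generic neural network would not satisfy them without an architecture designed for that purpose.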