Keywords: optimal transport, Markov chains, bisimulation
Abstract: We propose a new framework for formulating optimal transport distances between
Markov chains. Previously known formulations studied couplings between the
entire joint distributions induced by the chains, and derived solutions via a reduction
to dynamic programming (DP) in an appropriately defined Markov decision process.
This formulation has, however, not led to particularly efficient algorithms so far,
since computing the associated DP operators requires fully solving a static optimal
transport problem, and these operators need to be applied numerous times during
the overall optimization process. In this work, we develop an alternative perspective
by considering couplings between a “flattened” version of the joint distributions
that we call discounted occupancy couplings, and show that calculating optimal
transport distances in the full space of joint distributions can be equivalently
formulated as solving a linear program (LP) in this reduced space. This LP
formulation allows us to port several algorithmic ideas from other areas of optimal
transport theory. In particular, our formulation makes it possible to introduce an
appropriate notion of entropy regularization into the optimization problem, which
in turn enables us to directly calculate optimal transport distances via a Sinkhorn-
like method we call Sinkhorn Value Iteration (SVI). We show both theoretically and
empirically that this method converges quickly to an optimal coupling, essentially
at the same computational cost as running vanilla Sinkhorn for each pair of states.
Along the way, we point out that our optimal transport distance exactly matches
the common notion of bisimulation metrics between Markov chains, so our
results also apply to computing such metrics; in fact, our algorithm turns out to
be significantly more efficient than the best known methods developed so far for
this purpose.
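
To make the dynamic-programming operator described in the abstract concrete, below is a minimal sketch of a Sinkhorn-based fixed-point iteration for a discounted optimal transport (bisimulation-style) distance between two finite Markov chains. This is an illustrative baseline under assumed inputs (transition matrices P and Q, ground cost matrix cost, discount gamma, regularization eps), not the paper's Sinkhorn Value Iteration: here each application of the operator fully solves an entropy-regularized OT problem for every state pair, which is precisely the per-operator expense the abstract identifies as the bottleneck of the DP-based approach.

import numpy as np

def sinkhorn_plan(C, mu, nu, eps, n_iter=100):
    # Standard Sinkhorn iterations for the entropy-regularized OT plan
    # between distributions mu and nu under ground cost matrix C.
    K = np.exp(-C / eps)
    v = np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_distance_between_chains(P, Q, cost, gamma=0.9, eps=0.01,
                               n_sweeps=200, tol=1e-6):
    # Fixed-point iteration on d(x, y) = cost(x, y) + gamma * OT_d(P(.|x), Q(.|y)),
    # where the inner (regularized) OT problem uses the current distance d as cost.
    # P: (nX, nX) row-stochastic, Q: (nY, nY) row-stochastic, cost: (nX, nY).
    nX, nY = cost.shape
    d = np.zeros((nX, nY))
    for _ in range(n_sweeps):
        d_new = np.empty_like(d)
        for x in range(nX):
            for y in range(nY):
                pi = sinkhorn_plan(d, P[x], Q[y], eps)
                d_new[x, y] = cost[x, y] + gamma * np.sum(pi * d)
        if np.max(np.abs(d_new - d)) < tol:
            return d_new
        d = d_new
    return d

In this naive variant, every outer sweep re-solves nX * nY Sinkhorn problems to convergence. The abstract's contribution is to recast the problem as an LP over discounted occupancy couplings, so that Sinkhorn-style updates and value updates can be combined (SVI) at essentially the cost of a single vanilla Sinkhorn run per state pair.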
Submission Number: 61