The Sparse-Plus-Low-Rank Quasi-Newton Method for Entropic-Regularized Optimal Transport

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: A new quasi-Newton method for entropic-regularized optimal transport that combines the advantages of Hessian sparsification and low-rank approximation, with provable convergence guarantees.
Abstract: Entropic-regularized optimal transport (OT) has gained considerable attention in machine learning due to its ability to provide scalable solutions for OT-based tasks. However, most existing algorithms, including the Sinkhorn algorithm and its extensions, suffer from relatively slow convergence in many cases. More recently, second-order methods based on Hessian sparsification have been proposed. Despite their promising results, they have two major issues: first, the effect of sparsification is poorly understood theoretically; second, when the transport plan is dense, Hessian sparsification performs poorly. In this paper, we propose a new quasi-Newton method to address these problems. First, we develop new theoretical analyses of the benefits of Hessian sparsification, which lay the foundation for highly flexible sparsification schemes. Then we introduce an additional low-rank term in the approximate Hessian to better handle the dense case. Finally, we rigorously analyze the convergence properties of the proposed algorithm and conduct numerical experiments demonstrating its improved performance on large-scale OT problems.
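For reference, the entropic-regularized OT problem targeted by Sinkhorn-type and second-order methods is commonly written as follows; this is the standard formulation from the literature, and the paper's exact notation may differ:

```latex
% Entropic-regularized OT between histograms a and b with cost matrix M
% and regularization parameter \eta (some papers use 1/\eta instead):
\min_{T \in \Pi(a,b)} \; \langle T, M \rangle
  + \eta \sum_{i,j} T_{ij} \left( \log T_{ij} - 1 \right),
\qquad
\Pi(a,b) = \{\, T \ge 0 \;:\; T\mathbf{1} = a,\ T^{\top}\mathbf{1} = b \,\}.
```

The Sinkhorn algorithm performs alternating (block-coordinate) updates on the dual variables of this problem; Newton-type methods instead exploit curvature information from the dual Hessian, which is the matrix the paper approximates.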
Lay Summary: Optimal transport is a powerful mathematical tool for comparing data distributions — imagine matching piles of sand to holes with minimal effort. It is used in machine learning to compare images, text, or any structured data. A modern variant called entropic-regularized optimal transport makes this task faster and more stable, but the popular algorithms for it can still be slow in practice. This paper introduces a new algorithm that speeds things up by approximating the largest numerical object in the computation, a matrix called the Hessian, in a smarter way. We approximate the Hessian as the sum of two simpler structures — a sparse matrix containing many zero elements, and a low-rank matrix that can be represented by a few simple vectors. In this way, we capture both the small important details and the overall big picture of the large Hessian matrix. The resulting method is faster, more reliable, and better suited to large-scale data tasks, making optimal transport more practical for real-world machine learning problems.
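As a concrete illustration of the sparse-plus-low-rank idea described above, here is a minimal Python sketch (not the authors' implementation; all names, sizes, and matrices are made-up placeholders) that approximates a Hessian by S + UUᵀ and solves a Newton-type system with conjugate gradients using only fast matrix-vector products:

```python
# Minimal sketch of a sparse-plus-low-rank Hessian approximation:
#   H ≈ S + U @ U.T, with S sparse (many zeros) and U tall-and-thin (rank r).
# A Newton-type direction d solving (S + U U^T) d = g is obtained with
# conjugate gradients, costing only O(nnz(S) + n*r) work per CG iteration.
import numpy as np
from scipy.sparse import identity, random as sparse_random
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
n, r = 1000, 5  # problem size and rank of the low-rank term (illustrative)

# Hypothetical stand-ins: in the method, S would hold the retained
# (sparsified) Hessian entries and U the dense low-rank correction;
# here both are random placeholders.
S = sparse_random(n, n, density=0.01, random_state=0)
S = (S + S.T + 25.0 * identity(n)).tocsr()  # symmetrize, make positive definite
U = rng.standard_normal((n, r)) / np.sqrt(n)

def matvec(v):
    # (S + U U^T) v without ever forming the dense n-by-n matrix
    return S @ v + U @ (U.T @ v)

H_approx = LinearOperator((n, n), matvec=matvec)
g = rng.standard_normal(n)   # stand-in for the dual gradient
d, info = cg(H_approx, g)    # quasi-Newton direction from the approximate Hessian
assert info == 0             # CG converged
```

The design point is that both terms admit compact storage and fast matrix-vector products, so the Newton system never requires the dense Hessian: the sparse part captures the large localized entries, while the low-rank part covers the globally dense structure that pure sparsification misses.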
Link To Code: https://github.com/Aoblex/numerical-experiments
Primary Area: Optimization->Convex
Keywords: optimal transport, entropic regularization, quasi-Newton method, Hessian sparsification, low-rank approximation, convergence analysis
Submission Number: 11085