Abstract: Optimization with orthogonality constraints frequently arises in fields such as machine learning. Riemannian optimization offers a powerful framework for solving these problems by equipping the constraint set with a Riemannian manifold structure and performing optimization intrinsically on the manifold. This approach typically involves computing a search direction in the tangent space and updating the variables via a retraction operation. However, as the size of the variables increases, the computational cost of the retraction can become prohibitively high, limiting the applicability of Riemannian optimization to large-scale problems. To address this challenge and enhance scalability, we propose a novel approach that restricts each update to a random submanifold, thereby significantly reducing the per-iteration complexity. We introduce two sampling strategies for selecting the random submanifolds and theoretically analyze the convergence of the proposed methods. We provide convergence results for general nonconvex functions and for functions satisfying the Riemannian Polyak–Łojasiewicz condition, as well as for stochastic optimization settings. Additionally, we show how our approach generalizes to quotient manifolds derived from the orthogonal manifold. Extensive experiments verify the benefits of the proposed method across a wide variety of problems.
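To make the idea of a randomized-submanifold update concrete, below is a minimal NumPy sketch of one step on the Stiefel manifold St(n, p) = {X : X^T X = I_p}. This is an illustration under our own assumptions, not the paper's exact algorithm: the row-sampling scheme, the Cayley retraction, the function name `random_submanifold_step`, and the toy objective are all ours.

```python
import numpy as np

def random_submanifold_step(X, egrad, r, step, rng):
    """One randomized-submanifold update on the Stiefel manifold St(n, p).

    Instead of retracting the full n x p iterate, act on r randomly chosen
    rows with a small r x r rotation, so the retraction costs O(r^3)
    rather than scaling with n.

    X    : current iterate with orthonormal columns (n x p).
    egrad: Euclidean gradient of the objective at X (n x p).
    r    : submanifold size parameter, r << n.
    """
    n, p = X.shape
    idx = rng.choice(n, size=r, replace=False)  # random row sampling

    # Euclidean gradient of h(Q) = f(M(Q) X) at Q = I, where M(Q) embeds
    # the rotation Q in the rows/columns indexed by idx.
    A = egrad[idx, :] @ X[idx, :].T             # r x r block of G X^T

    # Riemannian gradient on the rotation group O(r) at the identity is
    # the skew-symmetric part of A.
    xi = 0.5 * (A - A.T)

    # Cayley retraction: Q = (I - W)^{-1}(I + W) is orthogonal for skew W.
    W = -0.5 * step * xi
    I = np.eye(r)
    Q = np.linalg.solve(I - W, I + W)

    # Rotate only the sampled rows; X stays exactly on St(n, p) because the
    # update is left-multiplication by an orthogonal matrix.
    X_new = X.copy()
    X_new[idx, :] = Q @ X[idx, :]
    return X_new

# Usage on a toy objective f(X) = -trace(X^T C X), whose Euclidean
# gradient is -2 C X (minimizers span the top eigenspace of C).
rng = np.random.default_rng(0)
n, p, r = 500, 10, 50
C = rng.standard_normal((n, n)); C = C + C.T
X = np.linalg.qr(rng.standard_normal((n, p)))[0]
for _ in range(200):
    X = random_submanifold_step(X, -2 * C @ X, r=r, step=1e-3, rng=rng)
print(np.linalg.norm(X.T @ X - np.eye(p)))  # ~0: feasibility is preserved
```

The design point this sketch illustrates is that each iteration retracts only an r x r rotation instead of the full n x p matrix, while the iterate remains exactly feasible.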
Lay Summary: In machine learning, we often encounter optimization problems where we need to find the best solution while satisfying orthogonality constraints. These constraints ensure that certain variables remain mutually independent. However, as the dimensionality grows, traditional methods for solving these problems become computationally expensive, making them impractical for large-scale applications.
To address this, we propose a novel method based on Riemannian optimization, a mathematical framework that models the constraint set as a curved space called a manifold. Unlike standard approaches that require complex computations across the entire manifold, our technique reduces the workload by updating the iterates only on randomly selected smaller pieces of the manifold, known as random submanifolds. This significantly reduces the computational cost of each iteration.
We also provide theoretical guarantees of convergence, ensuring that our method reliably finds solutions. Extensive experiments confirm its effectiveness across diverse problems.
Link To Code: https://github.com/andyjm3/RSDM
Primary Area: Optimization->Non-Convex
Keywords: Riemannian manifold, Stiefel manifold, Efficient optimization, Randomized submanifold
Submission Number: 8144