Frank-Wolfe-based Algorithms for Approximating Tyler's M-estimator

Published: 31 Oct 2022, 18:00; Last Modified: 11 Jan 2023, 13:23
NeurIPS 2022 Accept
Keywords: Frank-Wolfe, conditional gradient, projection-free, Tyler's M-estimator, robust covariance estimation, heavy tailed distributions, elliptical distributions, convex optimization, nonconvex optimization
TL;DR: First Frank-Wolfe algorithms for computing Tyler's M-estimator which converge to the optimal solution with (nearly) linear runtime per iteration and with linear rates
Abstract: Tyler's M-estimator is a well-known procedure for robust and heavy-tailed covariance estimation. Tyler himself suggested an iterative fixed-point algorithm for computing his estimator; however, it requires super-linear (in the size of the data) runtime per iteration, which may be prohibitive at large scale. In this work we propose, to the best of our knowledge, the first Frank-Wolfe-based algorithms for computing Tyler's estimator. One variant uses standard Frank-Wolfe steps, the second also considers \textit{away-steps} (AFW), and the third is a \textit{geodesic} version of AFW (GAFW). AFW provably requires, up to a log factor, only linear time per iteration, while GAFW runs in linear time (up to a log factor) in a large-$n$ (number of data-points) regime. All three variants are shown to provably converge to the optimal solution with a sublinear rate, under standard assumptions, despite the fact that the underlying optimization problem is neither convex nor smooth. Under an additional fairly mild assumption, which holds with probability 1 when the (normalized) data-points are i.i.d. samples from a continuous distribution supported on the entire unit sphere, AFW and GAFW are proved to converge with linear rates. Importantly, all three variants are parameter-free and use adaptive step-sizes.
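For context, the classical fixed-point iteration attributed to Tyler (the baseline the abstract contrasts against, not the Frank-Wolfe variants proposed in this paper) can be sketched as follows. This is a minimal NumPy sketch under standard assumptions (n > p, data in general position); the function name, tolerance, and iteration cap are illustrative choices, not from the paper.

```python
import numpy as np

def tyler_fixed_point(X, tol=1e-8, max_iter=500):
    """Classical fixed-point iteration for Tyler's M-estimator.

    X: (n, p) array of n data points in R^p (n > p, general position).
    Returns a (p, p) symmetric positive-definite scatter matrix,
    normalized to have trace p.
    """
    n, p = X.shape
    Sigma = np.eye(p)
    for _ in range(max_iter):
        # Weights w_i = x_i^T Sigma^{-1} x_i for each data point.
        inv_Sigma = np.linalg.inv(Sigma)
        w = np.einsum('ij,jk,ik->i', X, inv_Sigma, X)
        # Weighted second-moment update; forming this sum of outer
        # products costs O(n p^2) per iteration -- super-linear in the
        # data size n*p, which is the cost the paper aims to reduce.
        Sigma_new = (p / n) * (X.T / w) @ X
        # Fix the scale ambiguity by normalizing to trace p.
        Sigma_new *= p / np.trace(Sigma_new)
        if np.linalg.norm(Sigma_new - Sigma, ord='fro') < tol:
            return Sigma_new
        Sigma = Sigma_new
    return Sigma
```

The trace normalization reflects that Tyler's estimator is only defined up to a positive scalar; any fixed scale convention works.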
Supplementary Material: pdf