Searching for the Best Polynomial Approximation for the Accurate Log Matrix Normalization in Global Covariance Pooling

20 Sept 2025 (modified: 26 Feb 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Global Covariance Pooling, Matrix Normalization, Polynomial Approximation
Abstract: Global Covariance Pooling (GCP) has significantly improved Deep Convolutional Neural Networks (DCNNs) by leveraging richer second-order statistics. However, since covariance matrices lie on the Symmetric Positive Definite (SPD) manifold, normalization is required to map them back into the Euclidean domain. The mathematically accurate approach, Matrix Log Normalization (MLN), suffers from gradient instabilities and requires eigendecomposition (EIG) or singular value decomposition (SVD), both of which are GPU-unfriendly. To address these instabilities, Matrix Power Normalization (MPN) introduced square-root normalization. Since then, most works have focused on approximating the matrix square root, typically via Newton–Schulz iterations or polynomial (Taylor and Padé) expansions, as these are GPU-friendly. Yet no prior work has attempted to approximate the more accurate MLN using polynomials, despite their inherent GPU efficiency. In this work, we explore a broad range of polynomial families for approximating MLN, including Taylor and Padé expansions and the orthogonal Chebyshev, Legendre, and Laguerre polynomials, and conclude that Chebyshev polynomials offer the most accurate and efficient approximation. Experiments on large-scale visual recognition benchmarks demonstrate that our approach achieves competitive accuracy while substantially reducing training cost. For reproducibility, the code will be released upon acceptance.
Primary Area: learning theory
Submission Number: 23447