Phase transitions for the existence of unregularized M-estimators in single index models

Published: 01 May 2025, Last Modified: 18 Jun 2025. ICML 2025 poster. CC BY 4.0
TL;DR: This paper studies phase transitions for the existence of unregularized M-estimators in single-index models in the proportional asymptotic regime.
Abstract: This paper studies phase transitions for the existence of unregularized M-estimators under proportional asymptotics, where the sample size $n$ and feature dimension $p$ grow proportionally with $n/p \to \delta \in (1, \infty)$. We study the existence of M-estimators in single-index models where the response $y_i$ depends on covariates $x_i \sim N(0, I_p)$ through an unknown index ${w} \in \mathbb{R}^p$ and an unknown link function. An explicit expression is derived for the critical threshold $\delta_\infty$ that determines the phase transition for the existence of the M-estimator, generalizing the results of Candès & Sur (2020) for binary logistic regression to other single-index models. Furthermore, we investigate the existence of a solution to the nonlinear system of equations governing the asymptotic behavior of the M-estimator when it exists. The existence of a solution to this system for $\delta > \delta_\infty$ remains largely unproven outside the global null in binary logistic regression. We address this gap with a proof that the system admits a solution if and only if $\delta > \delta_\infty$, providing a comprehensive theoretical foundation for proportional asymptotic results that require the existence of a solution to the system as a prerequisite.
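As a rough illustration of the kind of phase transition described in the abstract, the sketch below looks only at the simplest special case: binary logistic regression without intercept under the global null, where labels are fair coin flips independent of the covariates. In that case the MLE fails to exist exactly when the data are linearly separable through the origin, and Cover's classical counting formula gives the separability probability for points in general position. This is not the paper's expression for $\delta_\infty$; the function name `separable_probability` and the choice of test values are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def separable_probability(n: int, p: int) -> Fraction:
    """Probability that n points in general position in R^p, with i.i.d.
    fair +/-1 labels, are separable by a hyperplane through the origin.

    Uses Cover's counting formula:
        P(n, p) = 2^(1 - n) * sum_{k=0}^{p-1} C(n - 1, k).
    Assumption: global null (labels independent of covariates), no intercept.
    """
    if n <= 0 or p <= 0:
        raise ValueError("n and p must be positive integers")
    # math.comb returns 0 when k > n - 1, so n <= p gives probability 1.
    total = sum(comb(n - 1, k) for k in range(p))
    return Fraction(2 * total, 2 ** n)


if __name__ == "__main__":
    p = 100
    for delta in (1.5, 2.0, 2.5, 3.0):
        n = int(delta * p)
        prob = float(separable_probability(n, p))
        # Separable data means the logistic MLE does not exist.
        print(f"n/p = {delta:.1f}: P(separable) = {prob:.6f}")
```

In this special case the probability moves from near one to near zero around $n/p = 2$ as $p$ grows, which is consistent with the threshold for the global null in binary logistic regression reported by Candès & Sur (2020). Exact rational arithmetic via `Fraction` is used so the value at $n = 2p$ can be checked without floating-point error.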
Lay Summary: This paper explores when certain statistical tools, called M-estimators, can be reliably used in high-dimensional settings—specifically, when the number of data points $n$ and the number of variables $p$ increase at the same rate. We focus on models where the outcome depends on many variables in an unknown way. We identify a precise threshold of the ratio $n/p$ that determines whether an M-estimator exists, extending previous results from specific cases like logistic regression to a broader class of models. We also prove a key mathematical result: a system of equations describing the estimator’s asymptotic behavior has a solution if and only if this threshold is exceeded. This resolves an open question and provides a stronger foundation for studying statistical methods in high dimensions.
Primary Area: Theory->Everything Else
Keywords: proportional asymptotics, CGMT, conic geometry, convex analysis in Hilbert space
Submission Number: 4101