Abstract: A central question in computer science and statistics is whether efficient algorithms can achieve the information-theoretic limits of statistical problems. Many computational-statistical tradeoffs have been shown under average-case assumptions, but since statistical problems are average-case in nature, it has been a challenge to base them on standard worst-case assumptions.
In PAC learning, where such tradeoffs were first studied, the question is whether computational efficiency can come at the cost of using more samples than information-theoretically necessary. We base such tradeoffs on NP-hardness and obtain:
• Sharp computational-statistical tradeoffs assuming NP requires exponential time: For every polynomial p(n), there is an n-variate class C with VC dimension 1 such that the sample complexity of time-efficiently learning C is Θ(p(n)).
• A characterization of RP vs. NP in terms of learning: RP = NP iff every NP-enumerable class is learnable with O(VCdim(C)) samples in polynomial time. The forward implication has been known since Pitt and Valiant (1988); we prove the reverse implication.
Notably, all our lower bounds hold against improper learners. These are the first NP-hardness results for improperly learning a subclass of polynomial-size circuits, circumventing formal barriers of Applebaum, Barak, and Xiao (2008).