The MLE is minimax optimal for LGC

Published: 01 Jan 2024 · Last Modified: 27 Jan 2025 · CoRR 2024 · CC BY-SA 4.0
Abstract: We revisit the recently introduced Local Glivenko-Cantelli setting, which studies distribution-dependent uniform convergence rates of the Maximum Likelihood Estimator (MLE). In this work, we investigate generalizations of this setting in which arbitrary estimators are allowed rather than just the MLE. Can a strictly larger class of measures be learned? Can better risk decay rates be obtained? We provide exhaustive answers to these questions -- both negative, provided the learner is barred from exploiting certain infinite-dimensional pathologies. On the other hand, allowing such exploits does lead to a strictly larger class of learnable measures.
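For orientation, a minimal sketch of the quantity at stake, assuming the standard Local Glivenko-Cantelli formulation (a measure mu on a countable alphabet, with the MLE being the empirical measure built from n i.i.d. samples); the notation below is illustrative and not taken from the paper:

% Illustrative LaTeX sketch of the Local Glivenko-Cantelli risk (assumed formulation).
% \mu is a probability measure on \mathbb{N}; \hat{\mu}_n denotes the empirical
% measure (the MLE) computed from n i.i.d. samples drawn from \mu.
\[
  R_n(\mu) \;=\; \mathbb{E}\,\bigl\|\hat{\mu}_n - \mu\bigr\|_\infty
           \;=\; \mathbb{E}\,\sup_{k \in \mathbb{N}} \bigl|\hat{\mu}_n(k) - \mu(k)\bigr| .
\]
% The LGC question is for which \mu one has R_n(\mu) \to 0 and at what
% \mu-dependent rate; the paper asks whether replacing \hat{\mu}_n with an
% arbitrary estimator enlarges the learnable class or improves the rate.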