On The Difficulty of Learning in Classification Problems: Optimality and Information-Theoretic Perspectives
Keywords: Learning Theory, Information Theory, Supervised Learning, Classification Problems
TL;DR: This paper proposes an information-theoretic formulation of classification problems and studies an approximation of the difficulty of learning under this formulation.
Abstract: This paper studies the hardness of learning in classification tasks. We formulate a classification problem using a fixed input distribution and a variable ground-truth classifier drawn from a prior distribution, and consider an average notion of risk. We then derive a closed-form solution for the optimal learner and the optimal risk, and use the latter to measure the hardness of learning. Using Fano's Inequality, we establish a risk lower bound in terms of information-theoretic quantities. Our bound overcomes the over-pessimism of classical lower bounds in statistical learning theory. Compared with existing information-theoretic lower bounds in similar settings, our bound is tighter and more practically relevant. Our analysis reveals a tradeoff between two key quantities that govern the difficulty of learning in classification problems, which we refer to as $\textit{identifiability}$ and $\textit{agreement}$. We also characterize the convergence behavior of our lower bound with respect to the sample size.
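For context on the proof tool the abstract names: a minimal sketch of the classical form of Fano's Inequality (the standard textbook statement, not the paper's refined bound, which is not reproduced here).

```latex
% Classical Fano's Inequality (standard form). For a target Y taking
% values in a finite set \mathcal{Y}, estimated by \hat{Y} from an
% observation X, with entropy measured in bits:
\[
  P\bigl(\hat{Y} \neq Y\bigr) \;\ge\; \frac{H(Y \mid X) - 1}{\log_2 |\mathcal{Y}|}.
\]
% High conditional entropy H(Y | X) thus forces a large error
% probability, which is the mechanism by which Fano-type arguments
% yield risk lower bounds of the kind the paper establishes.
```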
Primary Area: learning theory
Submission Number: 2249