TL;DR: We show that learnability in settings where agents can improve is not characterized by the VC dimension, analyze the sample complexity of learning several fundamental concept classes in this setting, and provide an empirical study on real datasets.
Abstract: One of the most basic lower bounds in machine learning is that in nearly any nontrivial setting, it takes at least $1/\epsilon$ samples to learn to error $\epsilon$ (and more, if the classifier being learned is complex). However, suppose that data points are agents who have the ability to improve by a small amount if doing so will allow them to receive a (desired) positive classification. In that case, we may actually be able to achieve zero error by just being "close enough". For example, imagine a hiring test used to measure an agent's skill at some job such that for some threshold $\theta$, agents who score above $\theta$ will be successful and those who score below $\theta$ will not (i.e., learning a threshold on the line). Suppose also that by putting in effort, agents can improve their skill level by some small amount $r$. In that case, if we learn an approximation $\hat{\theta}$ of $\theta$ such that $\theta \leq \hat{\theta} \leq \theta + r$ and use it for hiring, we can actually achieve error zero, in the sense that (a) any agent classified as positive is truly qualified, and (b) any agent who truly is qualified can be classified as positive by putting in effort. Thus, the ability for agents to improve has the potential to allow for a goal one could not hope to achieve in standard models, namely zero error.
In this paper, we explore this phenomenon more broadly, giving general results and examining under what conditions the ability of agents to improve can reduce the sample complexity of learning or, alternatively, can make learning harder. We also examine, both theoretically and empirically, what kinds of improvement-aware algorithms can account for agents who have a limited ability to improve when it is in their interest to do so.
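The threshold example above can be made concrete with a small simulation. The following is a minimal sketch (not the paper's implementation) in which the learner places $\hat{\theta}$ at the smallest positively labeled training score, so that $\theta \leq \hat{\theta}$, and, given enough samples, $\hat{\theta} \leq \theta + r$; the specific values of $\theta$, $r$, and the sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 0.6   # true (unknown) qualification threshold -- illustrative assumption
r = 0.05      # maximum amount by which an agent can improve -- illustrative assumption

# Training data: observed skills with labels y = 1[skill >= theta].
skills = rng.uniform(0, 1, size=200)
labels = skills >= theta

# Improvement-aware learner: place the threshold at the smallest positive example.
# By construction theta <= theta_hat; with enough samples, theta_hat <= theta + r
# (it suffices that some positive training point lies within r of theta).
theta_hat = skills[labels].min()

# Fresh agents best-respond: improve by up to r if that is enough to be classified positive.
test_skills = rng.uniform(0, 1, size=10_000)
improved = np.where((test_skills < theta_hat) & (test_skills + r >= theta_hat),
                    test_skills + r, test_skills)
predicted_positive = improved >= theta_hat

# Zero error in the improvement sense, assuming theta <= theta_hat <= theta + r:
# (a) everyone classified positive has (improved) skill >= theta, i.e., is truly qualified;
# (b) everyone who is qualified (skill >= theta) is classified positive after improving.
sound = predicted_positive <= (improved >= theta)         # positives are qualified
complete = (test_skills >= theta) <= predicted_positive   # qualified agents pass
print(theta_hat <= theta + r, sound.all(), complete.all())
```

With the assumed parameters, the learned $\hat{\theta}$ falls in $[\theta, \theta + r]$ with high probability, and both checks print True; a standard learner with the same data would still misclassify agents whose skill lies between $\theta$ and $\hat{\theta}$.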
Lay Summary: In settings where individuals are being judged, such as taking a test to certify competence at some task or applying for admission to a desirable program, individuals will often invest effort to increase the chance of a favorable outcome in response to the published qualification criteria. For example, knowledge of the cutoff score for passing a test would affect how much individuals study for it, or a loan applicant might take a money management course if they know that doing so will factor into the loan decision.
This capacity for improvement can impact the design and accuracy of decision-making algorithms. As a simple example, if every individual can truthfully change themselves to match any positive example, then a single positive example is sufficient for an algorithm to achieve zero classification error.
Our central question is: How does people's capacity to improve by a limited amount affect the design of accurate decision-making algorithms? To address this, we examine the conditions under which people's ability to improve can reduce or increase the minimum number of training examples required to learn a function to a desired error and confidence level (the sample complexity), and we investigate, both theoretically and empirically, which algorithms can effectively incorporate this behavior.
Link To Code: https://github.com/ripl/PLI
Primary Area: Theory->Learning Theory
Keywords: agent improvement, PAC learning, sample complexity, learning from graphs
Submission Number: 11099