Abstract: Strategic classification addresses a learning problem where a decision-maker implements a classifier over agents who may manipulate their features in order to receive favorable predictions. In the standard model of online strategic classification, in each round the decision-maker implements and publicly reveals a classifier, after which agents best respond with full knowledge of it. However, in practice, whether to disclose the classifier is often debated---some decision-makers believe that hiding the classifier can prevent misclassification errors caused by manipulation. In this paper, we formally examine how limiting the agents' access to the current classifier affects the decision-maker's performance. Specifically, we consider an extended online strategic classification setting where agents lack direct knowledge of the current classifier and instead manipulate based on a weighted average of historically implemented classifiers. Our main result shows that in this setting, the decision-maker incurs $(1-\gamma)^{-1}$ or $k_{\text{in}}$ times as many mistakes as in the full-knowledge setting, where $k_{\text{in}}$ is the maximum in-degree of the manipulation graph (representing how many distinct feature vectors can be manipulated to appear as a single one), and $\gamma$ is the discount factor governing the agents' memory of past classifiers. Our results demonstrate how withholding access to the classifier can backfire and degrade the decision-maker's performance in online strategic classification.
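The response model in the abstract---agents best responding to a $\gamma$-discounted average of past classifiers rather than to the current one---can be sketched as follows. This is an illustrative toy, not the paper's exact formulation: the linear-threshold classifiers, the norm-bounded manipulation budget, and the function names are all assumptions made here for concreteness.

```python
import numpy as np

def perceived_classifier(past_ws, gamma):
    """Discounted average of past linear classifiers w_1..w_t,
    weighting more recent rounds more heavily (discount factor gamma)."""
    t = len(past_ws)
    weights = np.array([gamma ** (t - 1 - i) for i in range(t)])
    weights /= weights.sum()
    return sum(wt * w for wt, w in zip(weights, past_ws))

def best_response(x, w_perceived, budget=1.0):
    """Agent shifts its feature vector (within a manipulation budget)
    just enough to be classified positive by the *perceived* classifier;
    otherwise it stays put."""
    score = w_perceived @ x
    if score >= 0:
        return x  # already classified positive; no manipulation needed
    norm = np.linalg.norm(w_perceived)
    if norm == 0:
        return x
    needed = -score / norm  # distance to the perceived decision boundary
    if needed <= budget:
        return x + (needed / norm) * w_perceived  # minimal move to the boundary
    return x  # manipulation exceeds the budget; report truthfully
```

Because agents aim at the discounted average rather than the current classifier, the decision-maker's latest update only gradually shifts the boundary agents respond to, which is the mechanism behind the $(1-\gamma)^{-1}$ factor in the mistake bound.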
Lay Summary: Algorithmic decision-making is increasingly used in domains such as job hiring, loan approvals, and college admissions, where individuals may strategically manipulate their inputs to receive favorable outcomes. This setting is captured by the problem of strategic classification, where the challenge is to make accurate predictions based on potentially manipulated data. A common belief is that hiding the classifier may reduce manipulation and improve performance, but whether transparency helps or hurts remains unclear.
This work formally compares the decision-maker’s performance under transparent and non-transparent settings. In the non-transparent case, individuals still have incentives to manipulate, but their responses are based on past classifiers rather than the current one. We show that, contrary to conventional wisdom, limited transparency can substantially increase misclassification errors.
Our findings demonstrate that withholding classifiers may backfire and degrade the decision-maker's performance in online strategic classification.
Primary Area: Theory->Game Theory
Keywords: strategic classification, mistake bound, online learning
Submission Number: 4090