Learning to Defer on Anonymously Annotated Data

Cuong C. Nguyen; Thanh-Toan Do; Gustavo Carneiro

11 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: learning to defer, latent Dirichlet allocation, expectation-maximisation

Abstract: Recent advances in machine learning have prompted the development of human-machine cooperation that combines the efficiency of machines with the reliability of human expertise. One such approach is *learning to defer* (L2D), where a model learns to selectively defer decision-making to humans based on their historical performance on labelled data. Traditional L2D methods require the same set of human experts in both the training and deployment phases, so that the system can leverage their historical performance to allocate queries accordingly. This human-specific nature, however, makes such systems inflexible in dynamic real-world environments where expert availability can fluctuate due to leave, retirement, or the integration of new team members. To address this challenge, we propose leveraging anonymously annotated datasets, which are commonly available in practice, to infer annotation patterns and cluster human annotators based on behavioural similarities. Building upon this clustering of human experts, we develop an L2D variant, L2D-Clusters, that defers queries to a cluster rather than to a specific expert, with one expert from the cluster randomly selected to make the final decision. Empirical results show that our clustering aligns with known annotator behaviour and that L2D-Clusters performs comparably to expert-specific L2D, especially in onboarding scenarios with limited annotator-identified data.
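
The deferral step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, as is common in L2D formulations but not stated in the abstract, that the model outputs one score per class followed by one score per annotator cluster, and that the highest score decides between predicting and deferring. The function name `l2d_clusters_predict`, the `Expert` type, and the score layout are all hypothetical.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical type: an expert maps a query to a class label.
Expert = Callable[[object], int]


def l2d_clusters_predict(
    query: object,
    scores: Sequence[float],
    num_classes: int,
    clusters: List[List[Expert]],
    rng: random.Random,
) -> Tuple[int, Optional[int]]:
    """Return (label, cluster_id); cluster_id is None when the model predicts itself.

    Assumed layout: scores[:num_classes] are class scores and
    scores[num_classes:] are deferral scores, one per cluster.
    """
    choice = max(range(len(scores)), key=lambda i: scores[i])
    if choice < num_classes:
        # The model is confident enough to answer on its own.
        return choice, None
    # Defer to a cluster, then pick one of its experts uniformly at random.
    cluster_id = choice - num_classes
    expert = rng.choice(clusters[cluster_id])
    return expert(query), cluster_id
```

Because the routing target is a cluster rather than a named individual, any currently available member of the chosen cluster can serve the query, which is the flexibility the abstract argues for.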

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 4070
