Keywords: Learning to defer
Abstract: Learning to defer (L2D) enables human-AI cooperation by determining when an AI system should predict autonomously and when it should defer to a human expert. However, existing L2D methods assume that human performance stays constant over both short and long time horizons, contradicting established cognitive psychology research on fatigue-induced performance degradation. We present Fatigue-Aware Learning to Defer via Constrained Optimisation (FALCON), which explicitly models workload-varying human performance through psychologically grounded fatigue curves. FALCON formulates L2D as a Constrained Markov Decision Process (CMDP) whose states incorporate both task-specific characteristics and cumulative human workload. Specifically, we maximise classification accuracy subject to human-AI cooperation budget constraints, using PPO-Lagrangian optimisation. We also introduce the Fatigue-Aware L2D (FA-L2D) benchmark, which provides controllable fatigue-induced performance degradation across varying time horizons. It covers scenarios ranging from near-constant to highly variable human performance and replaces prior benchmarks that assumed stability over time. Extensive experiments on our benchmarks demonstrate that FALCON consistently outperforms state-of-the-art L2D approaches at all coverage levels, particularly when human performance varies. Notably, FALCON enables zero-shot generalisation to unseen experts with different fatigue patterns. Furthermore, L2D methods consistently surpass both AI-only and human-only baselines whenever coverage lies strictly between 0 and 1, underscoring the effectiveness of adaptive human-AI collaboration in a setting closer to real-world scenarios.
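As a rough illustration of the ingredients named in the abstract, the sketch below pairs a workload-dependent expert accuracy with a Lagrangian-relaxed reward and a multiplier update. It is not the authors' implementation: the exponential decay form, the function names (`expert_accuracy`, `penalised_reward`, `update_multiplier`), the default parameter values, and the choice of deferral rate as the constrained cost are all assumptions made for this example.

```python
import math


def expert_accuracy(workload, base_acc=0.9, floor_acc=0.6, decay=0.05):
    """Hypothetical fatigue curve (an assumption, not the paper's curve).

    Accuracy starts at base_acc with zero workload and decays
    exponentially toward floor_acc as cumulative workload grows.
    """
    return floor_acc + (base_acc - floor_acc) * math.exp(-decay * workload)


def penalised_reward(correct, deferred, lam):
    """Lagrangian-relaxed per-step reward.

    The task reward is 1 for a correct final prediction; the assumed
    cost is 1 for each deferral, weighted by the multiplier lam.
    """
    return float(correct) - lam * float(deferred)


def update_multiplier(lam, deferral_rate, budget, lr=0.1):
    """Projected dual ascent step on the Lagrange multiplier.

    lam grows when the observed deferral rate exceeds the budget,
    shrinks otherwise, and is clipped at zero.
    """
    return max(0.0, lam + lr * (deferral_rate - budget))
```

In a PPO-Lagrangian style loop, a policy would typically be trained on `penalised_reward` while `update_multiplier` is called after each batch of episodes, so that the deferral cost is priced higher whenever the budget is exceeded.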
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 666