Uncertainty-Aware Policy-Preserving Abstractions with Abstention for One-Shot Decisions

Published: 23 Sept 2025, Last Modified: 01 Dec 2025ARLETEveryoneRevisionsBibTeXCC BY 4.0
Track: Ideas, Open Problems and Positions Track
Keywords: Margin-based abstractions, Policy-preserving abstractions, Decision margins, Abstention-aware systems, Safety-critical decision-making, Beyond optimization
Abstract: State abstractions in reinforcement learning traditionally preserve optimal policies but discard information about decision confidence, preventing principled abstention when choices are uncertain. We propose that optimization—the dominant paradigm in RL—may itself be an inadequate lens for safety-critical deployments where the challenge is not achieving optimal performance, but recognizing boundaries where confident action risks unsafe amplification of uncertainty. We advocate for uncertainty-aware, policy-preserving abstractions: represent states not just by which actions are optimal, but by how strongly they dominate alternatives—their decision margins. When margins are small, systems should abstain rather than commit to weakly-supported choices. This integrates abstention directly into the state representation rather than adding confidence checks as an afterthought. This reframing reveals three fundamental challenges that resist traditional optimization: computational—when can we efficiently compute margins in large action spaces, and what structure makes this tractable? statistical—how do we learn which states require which margins without access to true rewards or deferral costs? epistemic—how do we set abstention thresholds when the costs of errors versus deferrals are fundamentally uncertain or contested? We argue these aren't merely technical gaps but potential boundaries where optimization itself becomes problematic. This position paper formalizes the margin-based framework and invites both theorists and experimentalists to investigate whether these challenges represent surmountable technical barriers or fundamental limits requiring new theoretical foundations beyond optimization.
Submission Number: 97
Loading