Legal-Gated Attention Networks: Enforcing Action Legality as a Structural Inductive Bias in Deep Reinforcement Learning

ICLR 2026 Conference Submission 12657 Authors

18 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: Reinforcement Learning, Explainable AI, Attention Mechanisms, Gated Networks
Abstract: In reinforcement learning (RL) for environments with state-dependent action constraints, conventional methods suffer from conflated representations: signals from infeasible actions introduce noise and complicate the learning task. While post-hoc masking is a common workaround, it does not prevent this contamination at a fundamental level, since illegal actions still influence the learned representations. To address this, we propose Legal-Gated Attention Networks (LGAN), an architecture that introduces a strong structural inductive bias by compiling action legality directly into the network itself. LGAN fundamentally alters self-attention by using a legality mask to gate the query formation process: only legal actions may form queries and attend to the state. This design guarantees by construction that illegal actions are structurally eliminated: they produce no queries, receive no gradients, and cannot influence policy or value updates. By using raw state vectors as values, LGAN's attention weights directly reveal which state components contribute to each legal action's value. We demonstrate on board games that this structurally grounded approach provides an effective framework for learning transparent policies, positioning LGAN as a principled method for building robust and interpretable agents in action-constrained environments.
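The gated-query mechanism the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `lgan_attention`, the per-action embedding matrix, and the single-head, NumPy-based formulation are all assumptions made for clarity. It shows the two structural properties claimed in the abstract: illegal actions form no queries at all (rather than being masked after the fact), and the raw state vectors serve directly as attention values, so each row of attention weights reads out which state components inform a legal action.

```python
import numpy as np

def lgan_attention(state, action_embed, legal_mask, Wq, Wk):
    """Sketch of legality-gated attention (hypothetical interface).

    state:        (S, d) raw state component vectors, used directly as values
    action_embed: (A, d) one learned embedding per action
    legal_mask:   (A,) boolean legality mask for the current state
    Wq, Wk:       (d, d) query/key projections
    """
    # Gate at query formation: illegal actions are dropped before any
    # projection, so they contribute no queries and would receive no
    # gradients in a differentiable implementation.
    legal_idx = np.flatnonzero(legal_mask)
    Q = action_embed[legal_idx] @ Wq              # queries for legal actions only
    K = state @ Wk                                # keys from state components

    scores = Q @ K.T / np.sqrt(K.shape[1])        # scaled dot-product scores
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)       # (n_legal, S), rows sum to 1

    # Values are the raw state vectors, so `attn` is directly interpretable
    # as per-action importance over state components.
    out = attn @ state                            # (n_legal, d) attended values
    return legal_idx, attn, out
```

Because the illegal actions never enter the computation graph, no post-hoc masking of logits is needed for them; downstream policy/value heads would operate only on the `n_legal` outputs.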
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 12657