Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers

Published: 22 Sept 2025, Last Modified: 22 Sept 2025
Venue: WiML @ NeurIPS 2025
License: CC BY 4.0
Keywords: Reinforcement Learning, Transformers, Cognitive Models, Experience-Weighted Attraction, Attention, Vector Quantization
Abstract: Transformers for control propagate credit through content-based attention yet lack an explicit memory of action-specific outcomes. We introduce Experience-Weighted Attraction with Vector Quantization for Online Decision Transformers (EWA–VQ–ODT). The method maintains a per-action mental account of recent success and failure. Continuous actions are routed to a small vector-quantized codebook by direct grid lookup, and each code holds an attraction scalar updated online by a simple recency-weighted rule: a global decay at every step and additive reinforcement when the code is chosen. We add these attractions as a small bias to the attention columns of action tokens before the softmax, leaving the backbone and learning objective unchanged and requiring no counterfactual modeling. On standard continuous-control benchmarks, EWA–VQ–ODT improves average evaluation return and sample efficiency over an Online Decision Transformer (ODT) baseline. The module is lightweight, easy to integrate, and interpretable through per-code attraction traces. In addition, we provide two concise theoretical analyses that formalize the attraction dynamics and bound the effect of the bias on attention, supporting safe and stable integration into transformer policies.
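For readers who want the mechanism in concrete form, below is a minimal NumPy sketch of the idea described in the abstract. All names (`AttractionTable`, `route_action`, `biased_attention`) and constants (`NUM_CODES`, `DECAY`, `BIAS_SCALE`) are illustrative assumptions, not the authors' implementation; the codebook routing is shown as a nearest-grid-point search, whereas the paper describes a direct grid lookup. The per-code decay-plus-reinforcement update and the pre-softmax bias on action-token columns follow the abstract's description.

```python
import numpy as np

# Illustrative constants; the paper does not specify these values.
NUM_CODES = 64     # size of the vector-quantized action codebook (assumed)
DECAY = 0.95       # global per-step decay of the attractions (assumed)
BIAS_SCALE = 0.1   # scale of the attraction bias on attention logits (assumed)


class AttractionTable:
    """Per-code 'mental account': one attraction scalar per codebook entry."""

    def __init__(self, num_codes: int = NUM_CODES):
        self.attractions = np.zeros(num_codes)

    def update(self, chosen_code: int, reward: float) -> None:
        # Recency weighting: every account decays globally at each step ...
        self.attractions *= DECAY
        # ... and only the chosen code receives additive reinforcement.
        self.attractions[chosen_code] += reward


def route_action(action: np.ndarray, grid: np.ndarray) -> int:
    """Map a continuous action to a codebook index (shown here as a
    nearest-grid-point search; the paper uses a direct grid lookup)."""
    return int(np.argmin(np.linalg.norm(grid - action, axis=1)))


def biased_attention(scores: np.ndarray, action_cols: np.ndarray,
                     action_codes: np.ndarray,
                     table: AttractionTable) -> np.ndarray:
    """Add the attraction bias to the attention columns of action tokens,
    then apply the softmax; the backbone logits themselves are unchanged."""
    scores = scores.copy()
    scores[:, action_cols] += BIAS_SCALE * table.attractions[action_codes]
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = np.linspace(-1.0, 1.0, NUM_CODES).reshape(-1, 1)  # toy 1-D action grid
    table = AttractionTable()

    # One interaction step: quantize the chosen action, then reinforce its account.
    code = route_action(np.array([0.3]), grid)
    table.update(chosen_code=code, reward=1.0)

    # Bias the attention logits of the columns that hold action tokens.
    scores = rng.normal(size=(6, 6))
    action_cols = np.array([2, 5])
    codes = np.array([code, code])
    attn = biased_attention(scores, action_cols, codes, table)
    print(attn.sum(axis=-1))  # each row still sums to 1 after the softmax
```

Because the bias added to each action column is bounded by `BIAS_SCALE` times the attraction magnitude, its effect on the resulting attention distribution can itself be bounded, which is the kind of guarantee the theoretical analyses mentioned in the abstract formalize.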
Submission Number: 345