SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

TMLR Paper 4950 Authors

25 May 2025 (modified: 18 Oct 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: Modern deep reinforcement learning (DRL) methods have made significant advances in handling continuous action spaces. However, real-world control systems, especially those requiring precise and reliable performance, often demand interpretability in the sense of a priori assessments of agent behavior to identify safe or failure-prone interactions with environments. To address this need, we propose SALSA-RL (Stability Analysis in the Latent Space of Actions), a novel RL framework that models control actions as dynamic, time-dependent variables evolving within a latent space. By employing a pre-trained encoder-decoder and a state-dependent linear system, our approach enables interpretability through local stability analysis, where instantaneous growth in action norms can be predicted before their execution. We demonstrate that SALSA-RL can be deployed in a non-invasive manner to assess the local stability of actions from pretrained RL agents without compromising performance across diverse benchmark environments. By enabling a more interpretable analysis of action generation, SALSA-RL provides a powerful tool for advancing the design, analysis, and theoretical understanding of RL systems.
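To make the abstract's mechanism concrete, below is a minimal sketch (not the authors' implementation) of how actions might be encoded into a latent space, evolved by a state-dependent linear map A(s_t), and checked for local stability via the spectral radius of A(s_t) before execution. All module and variable names (Encoder, Decoder, LatentDynamics, spectral_radius) and the dimensions are illustrative assumptions.

```python
# Hedged sketch of a latent action model in the spirit of SALSA-RL's abstract:
# an action a_t is encoded to a latent z_t, z_{t+1} = A(s_t) z_t with a
# state-dependent matrix A(s_t), and local stability is read off from the
# spectral radius of A(s_t). Not the paper's architecture; names are assumed.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, LATENT_DIM = 4, 1, 8  # illustrative sizes

class Encoder(nn.Module):
    """Maps an executable action into the latent action space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(ACTION_DIM, 32), nn.Tanh(),
                                 nn.Linear(32, LATENT_DIM))
    def forward(self, a):
        return self.net(a)

class Decoder(nn.Module):
    """Maps a latent action back to the executable action space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 32), nn.Tanh(),
                                 nn.Linear(32, ACTION_DIM))
    def forward(self, z):
        return self.net(z)

class LatentDynamics(nn.Module):
    """Produces a state-dependent matrix A(s_t) governing z_{t+1} = A(s_t) z_t."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(STATE_DIM, LATENT_DIM * LATENT_DIM)
    def forward(self, s):
        return self.net(s).view(-1, LATENT_DIM, LATENT_DIM)

def spectral_radius(A):
    """Largest eigenvalue magnitude of A; values > 1 signal locally growing action norms."""
    return torch.linalg.eigvals(A).abs().max(dim=-1).values

# Example: assess local stability of the latent action update before executing it.
enc, dec, dyn = Encoder(), Decoder(), LatentDynamics()
s_t = torch.randn(1, STATE_DIM)      # current environment state
a_t = torch.randn(1, ACTION_DIM)     # action proposed by a (pretrained) policy
z_t = enc(a_t)                       # latent representation of the action
A_t = dyn(s_t)                       # state-dependent linear system
z_next = torch.bmm(A_t, z_t.unsqueeze(-1)).squeeze(-1)
a_next = dec(z_next)                 # decoded next action
rho = spectral_radius(A_t).item()
print(f"spectral radius {rho:.3f} -> {'unstable' if rho > 1 else 'stable'} locally")
```

In such a setup, the stability check is non-invasive: the policy's proposed action can be analyzed through A(s_t) without altering the policy itself, which is the property the abstract claims for pretrained agents.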
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: We made changes on page 7 (highlighted in red) in response to Reviewer ZkYN's comment on 19 September.
Assigned Action Editor: ~Aleksandra_Faust1
Submission Number: 4950