Pruning Cannot Hurt Robustness: Certified Trade-offs in Reinforcement Learning

ICLR 2026 Conference Submission 17491 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Reinforcement Learning, Adversarial Robustness, State-Adversarial MDPs, Neural Network Pruning, Robust Policy Optimization, Lipschitz Bounds, Certified Robustness, Robustness–Performance Trade-off, Long-Horizon Stability, Magnitude Pruning, Micro-Pruning, Policy Compression, Robust Control, Robust Deep RL
TL;DR: Pruning deep RL policies not only compresses them but also provably improves robustness to state-adversarial attacks, revealing a trade-off where moderate sparsity smooths behavior and balances clean performance with resilience.
Abstract: Reinforcement learning (RL) policies deployed in real-world environments must remain reliable under adversarial perturbations. At the same time, modern deep RL agents are heavily overparameterized, raising costs and fragility concerns. While pruning has been shown to improve robustness in supervised learning, its role in adversarial RL remains poorly understood. We develop the first theoretical framework for \emph{certified robustness under pruning} in state-adversarial Markov decision processes (SA-MDPs). For Gaussian and categorical policies with Lipschitz networks, we prove that elementwise pruning can only tighten certified robustness bounds; pruning never makes the policy less robust. Building on this, we derive a novel three-term regret decomposition that disentangles clean-task performance, pruning-induced performance loss, and robustness gains, exposing a fundamental performance--robustness frontier. Empirically, we evaluate magnitude and micro-pruning schedules on continuous-control benchmarks with strong policy-aware adversaries. Across tasks, pruning consistently uncovers reproducible ``sweet spots'' at moderate sparsity levels, where robustness improves substantially without harming---and sometimes even enhancing---clean performance. These results position pruning not merely as a compression tool but as a structural intervention for robust RL.
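Note: a minimal numerical sketch of the bound-tightening intuition from the abstract, not the paper's implementation. It assumes the certificate is a product of per-layer operator norms that are entrywise monotone (e.g., the operator infinity-norm), under which zeroing weights can never increase the bound; the paper's exact certificate, layer shapes, and function names here are illustrative assumptions.

    import numpy as np

    def lipschitz_upper_bound(weights, ord=np.inf):
        # Product of per-layer operator norms upper-bounds the network's
        # Lipschitz constant (with 1-Lipschitz activations) w.r.t. that norm.
        return float(np.prod([np.linalg.norm(W, ord=ord) for W in weights]))

    def magnitude_prune(W, sparsity):
        # Elementwise magnitude pruning: zero the smallest-magnitude entries.
        k = int(sparsity * W.size)
        if k == 0:
            return W.copy()
        thresh = np.sort(np.abs(W), axis=None)[k - 1]
        return np.where(np.abs(W) <= thresh, 0.0, W)

    rng = np.random.default_rng(0)
    # Toy policy network: 17-dim observation -> 6-dim action (shapes are hypothetical).
    layers = [rng.normal(size=(64, 17)), rng.normal(size=(64, 64)), rng.normal(size=(6, 64))]
    pruned = [magnitude_prune(W, sparsity=0.5) for W in layers]

    # For entrywise-monotone operator norms (1-norm, inf-norm), zeroing entries
    # can only shrink each per-layer factor, so the certified bound never increases.
    print(lipschitz_upper_bound(layers), ">=", lipschitz_upper_bound(pruned))

The same monotonicity argument would not hold verbatim for the spectral norm (zeroing an entry can raise it), which is why the sketch fixes an entrywise-monotone norm; the paper's certified bounds should be consulted for the precise conditions.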
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 17491