Informed Asymmetric Actor-Critic: Theoretical Insights and Open Questions

Published: 17 Jul 2025, Last Modified: 06 Sept 2025 · EWRL 2025 Poster · CC BY 4.0
Keywords: asymmetric actor-critic, partial observability, POMDP, recurrent natural policy gradient, privileged information, asymmetric RL
TL;DR: We present an asymmetric actor-critic method for partially observable environments that leverages arbitrary privileged information, without requiring full-state access, while preserving unbiased policy gradient estimates.
Abstract: Reinforcement learning in partially observable environments requires agents to make decisions under uncertainty, based on incomplete and noisy observations. Asymmetric actor-critic methods improve learning in these settings by exploiting privileged information available during training. Most existing approaches, however, assume full access to the true state. In this work, we present a novel asymmetric actor-critic formulation grounded in informed partially observable Markov decision processes, allowing the critic to leverage arbitrary privileged information without requiring full-state access. We show that the method preserves the policy gradient theorem and yields unbiased gradient estimates even when the critic conditions on privileged partial information. Furthermore, we provide a theoretical analysis of the informed asymmetric recurrent natural policy gradient algorithm derived from our informed asymmetric learning paradigm. Our findings challenge the assumption that full-state access is necessary for unbiased policy learning, motivating the need to develop well-defined criteria to quantify the informativeness of additional training signals and opening new directions for asymmetric reinforcement learning.
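To make the asymmetric setup described in the abstract concrete, below is a minimal illustrative sketch (not the authors' implementation) of an actor-critic update in which the actor conditions only on the observation history, while the critic additionally receives an arbitrary privileged signal during training rather than the full state. All dimensions, network shapes, and the placeholder tensors (`obs`, `priv`, `returns`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch of an informed asymmetric actor-critic update.
# Assumption: the privileged signal `priv` is any extra training-time
# information (not necessarily the full state); the actor never sees it.
import torch
import torch.nn as nn

obs_dim, act_dim, priv_dim, hidden = 8, 4, 6, 32

class RecurrentActor(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs_seq):
        h, _ = self.rnn(obs_seq)  # encode the observation history
        return torch.distributions.Categorical(logits=self.head(h))

class InformedCritic(nn.Module):
    """Critic conditions on the history encoding *and* privileged information."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden, batch_first=True)
        self.value = nn.Sequential(
            nn.Linear(hidden + priv_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )

    def forward(self, obs_seq, priv_seq):
        h, _ = self.rnn(obs_seq)
        return self.value(torch.cat([h, priv_seq], dim=-1)).squeeze(-1)

actor, critic = RecurrentActor(), InformedCritic()
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=3e-4)

# Placeholder rollout: 2 trajectories of length 5 with dummy data.
obs = torch.randn(2, 5, obs_dim)
priv = torch.randn(2, 5, priv_dim)        # privileged training-time signal
acts = torch.randint(0, act_dim, (2, 5))
returns = torch.randn(2, 5)               # e.g. Monte Carlo returns

dist = actor(obs)
values = critic(obs, priv)                # asymmetric: critic sees priv
advantages = (returns - values).detach()
policy_loss = -(dist.log_prob(acts) * advantages).mean()
value_loss = (returns - values).pow(2).mean()

opt.zero_grad()
(policy_loss + 0.5 * value_loss).backward()
opt.step()
```

At deployment only the actor is used, so the privileged signal is never required at execution time; this mirrors the asymmetric training paradigm the abstract describes, though the exact algorithm (informed asymmetric recurrent natural policy gradient) differs from this plain policy-gradient sketch.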
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Gaspard_Lambrechts1, ~Daniel_Ebi1
Track: Regular Track: unpublished work
Submission Number: 162