DCRAC: Deep Conditioned Recurrent Actor-Critic for Multi-Objective Partially Observable Environments

Xiaodong Nian, Athirai Aravazhi Irissappane, Diederik M. Roijers

AAMAS 2020 (modified: 07 Feb 2024)

Abstract: In many decision-making problems, agents aim to balance multiple, possibly conflicting objectives. Existing research in deep reinforcement learning mainly focuses on fully observable single-objective solutions. In this paper, we propose DCRAC, a deep reinforcement learning framework for solving partially observable multi-objective problems. DCRAC follows a conditioned actor-critic approach in learning the optimal policy, where the network is conditioned on the weights, i.e., the relative importance of the different objectives. To deal with longer action-observation histories in partially observable environments, we introduce DCRAC-M, which uses memory networks to further enhance the reasoning ability of the agent. Experimental evaluation on benchmark problems shows the superiority of our approach when compared to the state of the art.
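
As a rough illustration of what conditioning on the objective weights can look like, the sketch below samples a weight vector, appends it to an observation to form a network input, and collapses a vector of per-objective values into one scalar. It is a minimal sketch under stated assumptions, not the DCRAC implementation: the abstract does not say how weights are sampled, how they are fed to the network, or how objectives are combined, so the linear scalarization, the concatenation, and the function names `sample_weights`, `conditioned_input`, and `scalarize` are all hypothetical.

```python
import random


def sample_weights(n_objectives, rng):
    """Sample a weight vector: non-negative entries that sum to 1.

    Assumption: weights are drawn uniformly from the simplex by normalizing
    exponential draws. The abstract does not specify a sampling scheme.
    """
    raw = [rng.expovariate(1.0) for _ in range(n_objectives)]
    total = sum(raw)
    return [r / total for r in raw]


def conditioned_input(observation, weights):
    """Build a network input by appending the weights to the observation.

    Assumption: conditioning is done by simple concatenation.
    """
    return list(observation) + list(weights)


def scalarize(values, weights):
    """Combine per-objective values into one scalar with a weighted sum.

    Assumption: linear scalarization of a vector-valued return.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


if __name__ == "__main__":
    rng = random.Random(0)
    w = sample_weights(2, rng)
    x = conditioned_input([0.1, 0.2], w)
    print(len(x), scalarize([1.0, 3.0], [0.25, 0.75]))
```

Under these assumptions, a single network that receives the weights as part of its input can be queried with different weight vectors at test time, rather than training one policy per preference.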
