Normalizing Flow Model for Policy Representation in Continuous Action Multi-agent Systems

2020 (modified: 03 Nov 2022) · AAMAS 2020
Abstract: Neural networks that output the parameters of a diagonal Gaussian distribution are widely used in reinforcement learning tasks with continuous action spaces. They have had considerable success in single-agent domains and even in some multi-agent tasks. However, general multi-agent tasks often require mixed strategies whose distributions cannot be well approximated by Gaussians or their mixtures. This paper proposes an alternative policy representation based on normalizing flows, which allows for greater flexibility in modeling action distributions than mixture models. We demonstrate its advantage over standard methods on a set of imitation learning tasks modeling human driving behavior in the presence of other drivers.
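The core idea, independent of the paper's specific architecture, is that a normalizing flow applies an invertible transform to samples from a simple base distribution (e.g. a Gaussian) and corrects the density via the change-of-variables formula. A minimal numpy sketch using a single planar-flow layer (the layer type and parameter values below are illustrative assumptions, not the paper's model):

```python
import numpy as np

def planar_flow(z, u, w, b):
    """One planar-flow layer f(z) = z + u * tanh(w.z + b).
    Returns transformed samples and log|det Jacobian| per sample."""
    a = np.tanh(z @ w + b)             # (n,) pre-activation through tanh
    x = z + np.outer(a, u)             # (n, d) transformed samples
    psi = np.outer(1.0 - a**2, w)      # (n, d) gradient of tanh term
    log_det = np.log(np.abs(1.0 + psi @ u))  # change-of-variables correction
    return x, log_det

rng = np.random.default_rng(0)
z = rng.standard_normal((1000, 2))     # samples from the Gaussian base

# Illustrative parameters; w.u = 6 >= -1 keeps the layer invertible.
u = np.array([2.0, 0.0])
w = np.array([3.0, 0.0])
b = 0.0

x, log_det = planar_flow(z, u, w, b)

# Log-density of the flow: log p(x) = log N(z; 0, I) - log|det J|.
base_logp = -0.5 * (z**2).sum(axis=1) - np.log(2.0 * np.pi)
flow_logp = base_logp - log_det
```

Stacking several such layers (summing the per-layer `log_det` terms) yields progressively more expressive, potentially multimodal action distributions while keeping exact log-likelihoods, which is what makes flows attractive as policy representations compared to diagonal Gaussians.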
