Learning to Ground Multi-Agent Communication with Autoencoders

Anonymous Author(s) Paper ID 6520




Supplemental Results


The following demo videos compare ae-comm with ae-rl-comm, no-comm, and rl-comm agents in the MarlGrid environments.

Please use the dropdown menu to select from a list of 10 examples for each environment.


RedBlueDoors

A reward of 1 is given to both agents if and only if the red door is opened first and then the blue door. If the blue door is opened first, no reward is given and the episode ends immediately.
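The reward rule above can be sketched as a small function. This is only an illustration of the stated rule, not the MarlGrid implementation: the function name `red_blue_doors_outcome` and the "red"/"blue" string encoding of door-opening events are hypothetical, and ending the episode on success is an assumption (the text only states that the episode ends when the blue door is opened first).

```python
from typing import Iterable, Tuple


def red_blue_doors_outcome(door_events: Iterable[str]) -> Tuple[int, bool]:
    """Return (shared_reward, episode_ended) for a sequence of door openings.

    Each event is the string "red" or "blue" (hypothetical encoding),
    listed in the order the doors were opened.
    """
    red_open = False
    for door in door_events:
        if door == "red":
            red_open = True
        elif door == "blue":
            # Blue after red: both agents receive 1.
            # Blue before red: no reward, and the episode ends immediately.
            # Assumption: the episode also ends on success.
            return (1 if red_open else 0), True
    # The blue door was never opened: no reward yet, episode still running.
    return 0, False
```

For example, `red_blue_doors_outcome(["red", "blue"])` yields a shared reward of 1, while `red_blue_doors_outcome(["blue", "red"])` yields 0 because the episode has already ended when the blue door opens.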






FindGoal

Each agent receives a reward of 1 when it reaches the goal, and an additional reward of 1 if all 3 agents reach the goal within the maximum episode length.
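As a rough sketch of this reward structure, the per-agent episode return could be computed as follows. The function name `find_goal_returns`, the representation of arrivals as step indices (with `None` for an agent that never arrives), and treating an arrival exactly at `max_steps` as within the limit are all assumptions made for illustration.

```python
from typing import List, Optional


def find_goal_returns(arrival_steps: List[Optional[int]], max_steps: int) -> List[int]:
    """Return the total episode reward for each agent.

    arrival_steps[i] is the step at which agent i reached the goal,
    or None if it never did (hypothetical encoding).
    """
    # An agent counts as having reached the goal only within the episode limit.
    reached = [s is not None and s <= max_steps for s in arrival_steps]
    # The additional reward is granted only if every agent reached the goal.
    bonus = 1 if all(reached) else 0
    # Individual reward of 1 for reaching the goal, plus the team bonus.
    return [int(r) + bonus for r in reached]
```

With three agents arriving at steps 5, 12, and 30 under a limit of 100 steps, each agent's return is 2; if one of them never arrives, the other two receive 1 each and the missing agent receives 0.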