Multi-Agent Compositional Communication Learning from Raw Visual Input

Anonymous

Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission
  • Abstract: One of the distinguishing properties of human language is its compositionality, which allows us to describe a complex environment with a limited vocabulary. Previous work has shown that neural network agents can learn to communicate in a compositional language when given disentangled input (i.e. hand-engineered features). Humans, however, perceive the environment through raw signals, not well-summarized features. In this work, we train neural network agents to communicate in a compositional language based only on raw image pixels. The agents play an image description game in which each image contains compositional factors such as colors and shapes. We train the agents using the obverter technique, in which an agent introspects to generate the message that makes the most sense to itself. In the zero-shot test, agents communicated with accuracy as high as 97% even for objects not seen during training, providing evidence that neural network agents are capable of learning a compositional language even from raw image pixels.
  • TL;DR: We train neural network agents to communicate in compositional language based on raw pixel input.
  • Keywords: compositional language, multi-agent communication, raw pixel input
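
The obverter idea mentioned in the abstract (the speaker picks the message that its own listening side understands best) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the greedy symbol-by-symbol search, the stopping threshold, and the names `obverter_message` and `toy_score` are hypothetical, and the toy scorer stands in for the neural network that would score a message against an image in the actual work.

```python
from typing import Callable, Sequence


def obverter_message(
    score: Callable[[str], float],
    vocab: Sequence[str],
    max_len: int = 5,
    threshold: float = 0.95,
) -> str:
    """Greedily build the message that the speaker itself scores highest.

    `score` plays the role of the agent's own listener: it returns how
    strongly the agent believes a message describes the target.
    """
    message = ""
    for _ in range(max_len):
        # Try every one-symbol extension and keep the one the agent's own
        # listener side rates as the best description of the target.
        best = max(vocab, key=lambda s: score(message + s))
        message += best
        # Stop early once the agent is confident in its own message.
        if score(message) >= threshold:
            break
    return message


def toy_score(target: str) -> Callable[[str], float]:
    """Hypothetical stand-in for a learned listener (not a neural network).

    Scores a message by the fraction of positions that match `target`,
    with a penalty for symbols beyond the target's length.
    """
    def score(message: str) -> float:
        matches = sum(a == b for a, b in zip(message, target))
        penalty = max(0, len(message) - len(target))
        return max(0.0, (matches - penalty) / len(target))
    return score
```

For example, `obverter_message(toy_score("ab"), vocab="abc")` extends the empty message one symbol at a time and stops as soon as the agent's own score crosses the threshold. In a learned setting the scorer would be the agent's image-conditioned model, so the message is chosen by introspection rather than by feedback from the other agent.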
