Keywords: natural language, reinforcement learning, structured prediction, multi-agent learning, deep learning
TL;DR: We use reinforcement learning for query reformulation on two tasks and surprisingly find that when training multiple agents diversity of the reformulations is more important than specialisation.
Abstract: We investigate methods to efficiently learn diverse strategies in reinforcement learning for a generative structured prediction problem: query reformulation. In the proposed framework an agent consists of multiple specialized sub-agents and a meta-agent that learns to aggregate the answers from sub-agents to produce a final answer. Sub-agents are trained on disjoint partitions of the training data, while the meta-agent is trained on the full training set. Our method makes learning faster, because it is highly parallelizable, and has better generalization performance than strong baselines, such as an ensemble of agents trained on the full data. We evaluate on the tasks of document retrieval and question answering. The improved performance seems due to the increased diversity of reformulation strategies. This suggests that multi-agent, hierarchical approaches might play an important role in structured prediction tasks of this kind. However, we also find that it is not obvious how to characterize diversity in this context, and a first attempt based on clustering did not produce good results. Furthermore, reinforcement learning for the reformulation task is hard in high-performance regimes. At best, it only marginally improves over the state of the art, which highlights the complexity of training models in this framework for end-to-end language understanding problems.