A Hierarchical Model for Device Placement

Azalia Mirhoseini; Anna Goldie; Hieu Pham; Benoit Steiner; Quoc V. Le; Jeff Dean

15 Feb 2018 (modified: 24 Feb 2018), ICLR 2018 Conference Blind Submission, Readers: Everyone

Abstract: We introduce a hierarchical model for efficient placement of computational graphs onto hardware devices, especially in heterogeneous environments with a mixture of CPUs, GPUs, and other computational devices. Our method learns to assign graph operations to groups and to allocate those groups to available devices. The grouping and device allocations are learned jointly. The proposed method is trained with policy gradient and requires no human intervention. Experiments with widely-used computer vision and natural language models show that our algorithm can find optimized, non-trivial placements for TensorFlow computational graphs with over 80,000 operations. In addition, our approach outperforms placements by human experts as well as a previous state-of-the-art placement method based on deep reinforcement learning. Our method achieves runtime reductions of up to 60.6% per training step when applied to models such as Neural Machine Translation.

TL;DR: We introduce a hierarchical model for efficient, end-to-end placement of computational graphs onto hardware devices.

Keywords: deep learning, device placement, policy gradient optimization
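
The two-level idea in the abstract (assign operations to groups, assign groups to devices, train both jointly with policy gradient) can be illustrated with a minimal sketch. This is not the authors' model: the tabular logits standing in for the learned policies, the `toy_runtime` cost (the load of the busiest device), the moving-average baseline, and every function and parameter name below are assumptions made for illustration only. The abstract does not describe the actual architecture or how runtime is measured.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def toy_runtime(op_costs, op_devices, num_devices):
    """Hypothetical cost proxy: the total load of the busiest device."""
    load = [0.0] * num_devices
    for cost, dev in zip(op_costs, op_devices):
        load[dev] += cost
    return max(load)


def train_placement(op_costs, num_groups, num_devices,
                    steps=2000, lr=0.1, seed=0):
    """Jointly train a tabular grouper and placer with REINFORCE."""
    rng = random.Random(seed)
    # Assumption: one row of logits per op (grouper) and per group (placer).
    grouper = [[0.0] * num_groups for _ in op_costs]
    placer = [[0.0] * num_devices for _ in range(num_groups)]
    baseline = None
    best_runtime, best_placement = float("inf"), None

    for _ in range(steps):
        g_probs = [softmax(row) for row in grouper]
        p_probs = [softmax(row) for row in placer]

        # Level 1: each op picks a group. Level 2: each group picks a device.
        groups = [sample(p, rng) for p in g_probs]
        devices = [sample(p, rng) for p in p_probs]
        op_devices = [devices[g] for g in groups]

        runtime = toy_runtime(op_costs, op_devices, num_devices)
        if runtime < best_runtime:
            best_runtime, best_placement = runtime, op_devices

        # Reward is negative runtime; the baseline reduces variance.
        reward = -runtime
        advantage = 0.0 if baseline is None else reward - baseline
        baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward

        # For a softmax policy, d log p(a) / d logit_k = 1[k == a] - p_k.
        for op, g in enumerate(groups):
            for k in range(num_groups):
                indicator = 1.0 if k == g else 0.0
                grouper[op][k] += lr * advantage * (indicator - g_probs[op][k])
        for g, d in enumerate(devices):
            for k in range(num_devices):
                indicator = 1.0 if k == d else 0.0
                placer[g][k] += lr * advantage * (indicator - p_probs[g][k])

    return best_runtime, best_placement
```

For example, `train_placement([4.0, 3.0, 2.0, 1.0, 2.0, 4.0], num_groups=3, num_devices=2)` returns the lowest toy runtime seen during training together with the per-operation device assignment that produced it. Both policies receive the same reward signal, which is what makes the grouping and the device allocation jointly learned in this sketch.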
