Zero-Shot Learning for Fast Optimization of Computation Graphs

Aditya Paliwal, Felix Gimeno, Vinod Nair, Yujia Li, Miles Lubin, Pushmeet Kohli, Oriol Vinyals

10 Jul 2020 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: We present a deep reinforcement learning approach to minimizing the execution cost of neural network computation graphs in an optimizing compiler. Unlike earlier learning-based work that requires expensive online training, we propose a “zero-shot learning” approach: an optimizer is trained offline and then generalizes to previously unseen graphs. This allows our approach to produce high-quality execution decisions on real-world TensorFlow graphs in seconds instead of hours. We consider two optimization tasks for computation graphs: minimizing running time and minimizing peak memory usage. Compared with an extensive set of baselines, our approach achieves significant improvements over classical and other learning-based methods on both tasks.
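
To make the peak-memory objective concrete, the following is a minimal sketch of how the peak memory of one execution order of a computation graph might be evaluated. It is an illustration only and not the cost model or optimizer from the paper: the function `peak_memory`, the toy graph `INPUTS`, the sizes in `SIZES`, and the rule that a tensor is freed as soon as its last consumer has run are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical toy graph: INPUTS[op] lists the ops whose outputs `op` consumes.
INPUTS: Dict[str, List[str]] = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b"],
    "e": ["c"],
    "f": ["d", "e"],
}

# Hypothetical output sizes (arbitrary units) for each op.
SIZES: Dict[str, int] = {"a": 1, "b": 4, "c": 4, "d": 1, "e": 1, "f": 1}


def peak_memory(order: List[str],
                inputs: Dict[str, List[str]],
                out_size: Dict[str, int]) -> int:
    """Return the peak live memory when ops run one at a time in `order`.

    Assumed model: an op's output is allocated when the op runs and is
    freed right after its last consumer has run. Outputs that nobody
    consumes stay live until the end.
    """
    position = {op: i for i, op in enumerate(order)}

    # Reject orders that run an op before one of its inputs.
    for op in order:
        for src in inputs[op]:
            if position[src] >= position[op]:
                raise ValueError(f"{op} is scheduled before its input {src}")

    # Number of consumers of each output that have not run yet.
    remaining = {op: 0 for op in order}
    for op in order:
        for src in inputs[op]:
            remaining[src] += 1

    live = 0
    peak = 0
    for op in order:
        live += out_size[op]          # allocate this op's output
        peak = max(peak, live)
        for src in inputs[op]:        # release inputs that are no longer needed
            remaining[src] -= 1
            if remaining[src] == 0:
                live -= out_size[src]
    return peak


breadth_first = ["a", "b", "c", "d", "e", "f"]
depth_first = ["a", "b", "d", "c", "e", "f"]

print(peak_memory(breadth_first, INPUTS, SIZES))  # 9
print(peak_memory(depth_first, INPUTS, SIZES))    # 6
```

Both orders are valid topological orders of the same graph, yet their peak memory differs, which is the kind of gap a scheduling optimizer tries to close. Running time would need a separate cost model (for example, per-op durations and device placement) that this sketch does not attempt.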
