Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization

Published: 14 Jul 2021, Last Modified: 05 May 2023, AutoML@ICML2021 Poster
Keywords: Bayesian Optimization, Gaussian Process, AutoML, Natural Language Understanding
TL;DR: We propose a high-dimensional multi-objective Bayesian optimization for tuning a natural language understanding model at Facebook.
Abstract: When tuning the architecture and hyperparameters of large machine learning models for on-device deployment, it is desirable to understand the optimal trade-offs between on-device latency and model accuracy. In this work, we leverage recent methodological advances in Bayesian optimization over high-dimensional search spaces and multi-objective Bayesian optimization to efficiently explore these trade-offs for a production-scale on-device natural language understanding model at Facebook.
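The abstract centers on finding optimal trade-offs between on-device latency and model accuracy, i.e. the Pareto frontier over the two objectives. As an illustrative sketch only (not the paper's actual method, which uses high-dimensional multi-objective Bayesian optimization), the dominance check underlying such trade-off analysis can be written as follows; the candidate names and measurements are hypothetical:

```python
def pareto_front(configs):
    """Return non-dominated configurations: those for which no other
    configuration is at least as accurate AND at least as fast, with a
    strict improvement in at least one objective."""
    front = []
    for c in configs:
        dominated = any(
            o["accuracy"] >= c["accuracy"] and o["latency"] <= c["latency"]
            and (o["accuracy"] > c["accuracy"] or o["latency"] < c["latency"])
            for o in configs
        )
        if not dominated:
            front.append(c)
    return front

# Hypothetical (accuracy, latency-in-ms) measurements for candidate
# architectures; higher accuracy and lower latency are both better.
candidates = [
    {"name": "A", "accuracy": 0.90, "latency": 12.0},
    {"name": "B", "accuracy": 0.92, "latency": 20.0},
    {"name": "C", "accuracy": 0.88, "latency": 25.0},  # dominated by A
    {"name": "D", "accuracy": 0.85, "latency": 8.0},
]

print([c["name"] for c in pareto_front(candidates)])  # ['A', 'B', 'D']
```

In the paper's setting, a multi-objective Bayesian optimizer proposes which configurations to evaluate next so that this frontier is mapped out with far fewer expensive accuracy/latency measurements than exhaustive search.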
Ethics Statement: The primary benefit of the proposed method is better optimization for high-dimensional multi-objective optimization problems, which can lead to better machine learning models through more efficient tuning. In this work, we optimized both the model accuracy and on-device latency relative to a baseline solution, which does not expose their absolute values. A potential risk associated with our method, and with black-box optimization in general, is over-reliance on fully automated, computationally expensive tuning that may yield only marginal improvements.