In-Context Learning for Latency Estimation

Published: 12 Jul 2024, Last Modified: 21 Aug 2024 · AutoML 2024 Workshop · CC BY 4.0
Keywords: In-context Learning, Latency estimation, Neural Architecture Search (NAS), Prior-Data Fitted Networks, Hardware-aware NAS
TL;DR: In-context learning for latency estimation that improves prediction quality and sample efficiency
Abstract: Neural architectures must be computationally efficient to perform well under the hardware constraints of edge-device deployment. Current Hardware-Aware NAS (HW-NAS) methods use surrogate models to predict hardware metrics (e.g., latency) during architecture search. These surrogates typically require large amounts of data to train or finetune and rely on search-space-specific encodings and meta-learning. We propose an In-Context Learning-based method for hardware latency estimation that generalises to unseen hardware in a single forward pass, given only a few labelled samples. Our surrogate is trained on real architecture-latency pairs with data augmentation to improve sample efficiency. Our method surpasses the state of the art in Spearman's $\rho$ and is 72\% to 92\% more sample efficient.
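To make the setup described in the abstract concrete, below is a minimal sketch of what a PFN-style in-context latency surrogate could look like at inference time: a transformer receives a few (architecture, latency) context pairs from an unseen device plus unlabelled query architectures, and predicts the query latencies in a single forward pass. All class names, dimensions, and hyperparameters here are illustrative assumptions, not the authors' released code, and a full PFN would additionally mask attention between query positions.

```python
# Hypothetical sketch of in-context latency estimation (PFN-style surrogate).
import torch
import torch.nn as nn

class ICLLatencyEstimator(nn.Module):
    """Maps a few labelled (architecture, latency) context pairs plus query
    architectures to latency predictions in one forward pass."""

    def __init__(self, arch_dim: int, d_model: int = 128, n_layers: int = 4):
        super().__init__()
        self.arch_embed = nn.Linear(arch_dim, d_model)
        self.latency_embed = nn.Linear(1, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)  # point estimate of latency

    def forward(self, ctx_archs, ctx_lat, query_archs):
        # Context tokens carry the architecture encoding and its measured
        # latency; query tokens carry the architecture encoding only.
        ctx = self.arch_embed(ctx_archs) + self.latency_embed(ctx_lat.unsqueeze(-1))
        qry = self.arch_embed(query_archs)
        h = self.encoder(torch.cat([ctx, qry], dim=1))
        # Predictions are read off the query positions.
        return self.head(h[:, ctx.shape[1]:, :]).squeeze(-1)

# Single forward pass: 10 labelled samples from an unseen device, 5 queries.
model = ICLLatencyEstimator(arch_dim=32)
ctx_a, ctx_l = torch.randn(1, 10, 32), torch.rand(1, 10)
preds = model(ctx_a, ctx_l, torch.randn(1, 5, 32))  # shape (1, 5)
```

Because adaptation to new hardware happens entirely in-context, no gradient updates are needed at deployment time, which is where the claimed sample-efficiency gains over finetuned surrogates would come from.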
Submission Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Optional Meta-Data For Green-AutoML: All questions below on environmental impact are optional.
CPU Hours: 360
GPU Hours: 360
TPU Hours: 0
Evaluation Metrics: No
Submission Number: 30