Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving

Yinwei Dai, Rui Pan, Anand Iyer, Kai Li, Ravi Netravali

Published: 04 Nov 2024, Last Modified: 30 Jan 2026, Crossref, CC BY-SA 4.0