Limiting Network Bandwidth to Unleash Throughput for Serverless Systems

Published: 21 May 2025, Last Modified: 17 Jun 2025
Venue: MLArchSys 2025 Oral
License: CC BY 4.0
Presentation: Virtual
Keywords: Serverless computing, resource management, machine learning for systems
Presenter Full Name: Prasoon Sinha
TL;DR: SoloTune makes intelligent network bandwidth allocations per request using an analytical model and online learning to meet the performance needs of users while improving the throughput of serverless platforms.
Presenter Email: prasoon.sinha@utexas.edu
Abstract: Serverless computing relieves developers of the burden of managing resources for their cloud applications. However, commercial providers require users to set a memory limit for their serverless function and then allocate the other resource types (CPU, network bandwidth) in proportion to it (i.e., couple them to memory). A few prior works show the inefficiencies of coupling CPU and memory, and instead allocate the two resource types independently. However, despite right-sizing CPU and memory allocations, we empirically find that these systems fall short of the desired throughput (function invocations per second or requests per second). Our key observation is that the throughput of serverless systems is limited by network congestion. The root cause of this congestion is that existing systems do not right-size network bandwidth for serverless functions, thereby increasing contention for this resource. In this work, we study commonly deployed serverless functions and find that network bandwidth is crucial to meeting performance needs: a function’s execution time can vary by 10× depending on the amount of network bandwidth allocated. However, our analysis reveals that determining how much network bandwidth to allocate is challenging: it depends on multiple factors, including the number of allocated CPU cores, function semantics, and function inputs. To this end, we build SoloTune, a holistic resource management framework for serverless systems. SoloTune uses online learning to predict a function’s compute time and then uses an analytical model to estimate the network bandwidth required to meet SLOs. Our initial experiments reveal that intelligent network bandwidth allocation alone reduces SLO violations by 1.3× at high load compared to state-of-the-art solutions.
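Illustrative sketch (not from the paper): the abstract describes a two-step idea, predicting compute time online and then deriving the bandwidth needed to meet an SLO. The Python below is a minimal sketch of that shape under loud assumptions. It is not SoloTune's actual predictor or analytical model, which the abstract does not specify. The exponentially weighted moving average, the formula bandwidth = data to transfer / (SLO - predicted compute time), and every name (`EwmaComputeTimePredictor`, `required_bandwidth_mbps`, `max_mbps`) are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EwmaComputeTimePredictor:
    """Hypothetical online predictor: an exponentially weighted moving average
    of observed compute times. A stand-in, not the paper's learner."""
    alpha: float = 0.3                  # weight given to the newest observation
    estimate_s: Optional[float] = None  # current prediction, in seconds

    def update(self, observed_s: float) -> float:
        """Fold one observed compute time into the estimate and return it."""
        if self.estimate_s is None:
            self.estimate_s = observed_s
        else:
            self.estimate_s = (self.alpha * observed_s
                               + (1.0 - self.alpha) * self.estimate_s)
        return self.estimate_s


def required_bandwidth_mbps(slo_s: float, compute_s: float,
                            transfer_megabits: float, max_mbps: float) -> float:
    """Assumed model: the time left after compute must cover the transfer.

    Returns the smallest bandwidth (Mbit/s) that moves `transfer_megabits`
    within `slo_s - compute_s` seconds, capped at `max_mbps`. If compute alone
    already exhausts the SLO, the cap is returned as a best effort.
    """
    budget_s = slo_s - compute_s
    if budget_s <= 0:
        return max_mbps
    return min(max_mbps, transfer_megabits / budget_s)


# Usage: two observed compute times, then size bandwidth for a 5 s SLO.
predictor = EwmaComputeTimePredictor(alpha=0.5)
predictor.update(2.0)
predicted_s = predictor.update(4.0)
allocation = required_bandwidth_mbps(slo_s=5.0, compute_s=predicted_s,
                                     transfer_megabits=800.0, max_mbps=1000.0)
```

Under these assumptions, a function with less slack between its compute time and its SLO is given more bandwidth, while a function with ample slack can be throttled to leave bandwidth for others.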
Presenter Bio: Prasoon is a third-year PhD student at UT Austin advised by Professor Neeraja J. Yadwadkar. He conducts research at the intersection of systems and machine learning, focusing on improving the performance, utilization, and sustainability of large-scale systems. His work spans a variety of systems, from serverless computing to ML inference serving.
Paper Checklist Guidelines: I certify that all co-authors have validated the presented results and conclusions, and have read and commit to adhering to the Paper Checklist Guidelines, Call for Papers and Publication Ethics.
YouTube Link: https://www.youtube.com/watch?v=VcOzMOeUL28
YouTube Link Poster: N/A
Dataset Release: I certify that all co-authors commit to release the dataset and necessary scripts to reproduce the presented results.
Google Slides: https://docs.google.com/presentation/d/1hSRpXL2wcFVJp-MMBukTD-2tS1D1PU7xuq2k0RNXw1s/edit?usp=sharing
Poster: No
Workshop Registration: Yes, the presenter has registered for the workshop.
YouTube Link Short: Coming soon
Submission Number: 2