Abstract: Multi-server jobs, which occupy multiple servers simultaneously during service, are prevalent in today's computing clusters, yet little is known about the delay performance of systems that serve them. In this paper, we consider queueing models for multi-server jobs in a scaling regime where the number of servers in the system becomes large. Prior work has derived upper bounds on the queueing probability in this scaling regime, but without corresponding lower bounds those results cannot be used to differentiate between policies. We focus on the mean queueing time of multi-server jobs and establish both upper and lower bounds under various scheduling policies. Our results show that a Priority policy achieves order optimality for minimizing mean queueing time and is strictly better than the First-Come-First-Serve policy.