Sequential addition of coded sub-tasks for straggler mitigation

Ajay Badita, Parimal Parag, Vaneet Aggarwal

Published: 2020, Last Modified: 11 Sept 2023INFOCOM 2020Readers: Everyone

Abstract: Straggler mitigation can be achieved by redundant computation. In MDS redundancy method, a task is divided into k sub-tasks which are encoded to n coded sub-tasks, such that a task is completed if any k coded sub-tasks are completed. Two important metrics of interest are task completion time, and server utilization cost which is the aggregate completed work by all servers in this duration. We consider a proactive straggler mitigation strategy where n 0 out of n coded sub-tasks are started at time 0 while the remaining n - n 0 coded sub-tasks are launched when ℓ 0 ≤ min(n 0 , k) of the initial ones finish. The coded sub-tasks are halted when k of them finish. For this flexible forking strategy with multiple parameters, we analyze the mean of two performance metrics for the proposed forking strategy when the random service completion time at each server is independent and distributed identically to a shifted exponential. Our analysis demonstrates that the regime of n 0 <; k leads to higher mean service completion time and no change in mean server utilization cost as compared to no forking (n 0 = n), and is thus not a regime of interest. For n 0 ≥ k, we find that there is a tradeoff between the two performance metrics and leads to decrease in mean server utilization cost at the expense of mean service completion time and an efficient choice of the parameters is helpful.

0 Replies