FastPERT: towards fast microservice application latency prediction via structural inductive bias over PERT networks
Abstract: The recent surge in popularity of cloud-native applications
using microservice architectures has led to a focus on accurate
end-to-end latency prediction for proactive resource allocation.
Existing models apply Graph Transformers to Microservice
Call Graphs or to Program Evaluation and Review
Technique (PERT) graphs to capture complex temporal
dependencies between microservices. However, these models
incur a high computational cost during both training and inference
phases. This paper introduces FastPERT, an efficient
model for predicting end-to-end latency in microservice applications.
FastPERT decomposes an execution trace into several
microservice tasks, using observations from prior execution
traces of the application, akin to the PERT approach. Subsequently,
a prediction model is constructed to estimate the
completion time for each individual task. This information,
coupled with the computational and structural inductive bias
of the PERT graph, facilitates the efficient computation of the
end-to-end latency of an execution trace. As a result, FastPERT
can efficiently capture the complex temporal causality
of different microservice tasks, leading to more accurate and
robust latency predictions across a variety of applications.
An evaluation based on datasets generated from large-scale
Alibaba microservice traces reveals that FastPERT significantly
improves training and inference efficiency without
compromising performance, demonstrating its potential as a
superior solution for real-time end-to-end latency prediction
in cloud-native microservice applications.
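
To illustrate the idea of combining per-task completion times with the structure of a PERT graph, the following is a minimal sketch of a classic critical-path (longest-path) computation over a task DAG. It is not FastPERT's implementation, which the abstract does not detail. The function name `end_to_end_latency`, the task names, and the durations are hypothetical, and the durations stand in for the output of a per-task prediction model.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def end_to_end_latency(durations, predecessors):
    """Return the critical-path finish time of a task DAG.

    durations:    task name -> predicted completion time (assumed given)
    predecessors: task name -> tasks that must finish before it starts
    Every task named in predecessors is assumed to appear in durations.
    """
    graph = {task: predecessors.get(task, ()) for task in durations}
    finish = {}
    # static_order() yields each task only after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        # A task starts once its slowest predecessor has finished.
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + durations[task]
    # The trace ends when the last task finishes.
    return max(finish.values(), default=0.0)


# Hypothetical trace: a gateway fans out to two parallel tasks that join.
durations = {"gateway": 2.0, "auth": 3.0, "catalog": 5.0, "render": 1.0}
predecessors = {
    "auth": ["gateway"],
    "catalog": ["gateway"],
    "render": ["auth", "catalog"],
}
latency = end_to_end_latency(durations, predecessors)
print(latency)  # 8.0: gateway (2) -> catalog (5) -> render (1)
```

In this sketch, the end-to-end latency is a single linear-time pass over the graph once per-task times are known, which is one way a PERT-style structure can avoid running a heavier graph model over the whole trace.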