Location-Aware Dynamic Scaling of Microservices in Mobile Edge Computing

Published: 2025, Last Modified: 22 Jan 2026 · IEEE Trans. Netw. Serv. Manag. 2025 · CC BY-SA 4.0
Abstract: The latency of cloud-hosted composite applications increases with the transmission time from the centralized cloud to end-users, compromising service quality. Typical AI application scenarios such as autonomous driving and smart cities demand low network latency. Edge computing addresses this by enabling data collection and analysis in nearby edge data centers, reducing user response time. However, in a complex edge-cloud environment, user response times and scaling costs vary across edge data centers, so dynamically finding the optimal scaling scheme is crucial. This paper proposes a predictive scaling method that adjusts the number of microservice containers based on fluctuations in user requests. Our prediction algorithm, a bidirectional GRU with an attention mechanism named A-Bi-GRU, aims to minimize scaling jitter. To achieve this, we introduce the concept of an observation window and employ a multi-objective optimization algorithm based on improved NSGA-II, named DP-GA, to scale microservices across different locations within each window. The solution minimizes both average user response time and scaling cost, enabling intelligent, location-aware dynamic scaling. Experimental results indicate that the proposed A-Bi-GRU forecasting algorithm achieves approximately a 30% improvement in prediction accuracy over traditional linear models such as LR and SVM, and about a 5–10% improvement over conventional recurrent neural networks such as RNN and LSTM. Furthermore, the proposed DP-GA multi-objective optimization algorithm reduces average response time by roughly 80% and scaling cost by approximately 50%.
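The abstract does not detail A-Bi-GRU's internals beyond "a bidirectional GRU with an attention mechanism," so the following is only an illustrative sketch of how attention can pool the per-timestep hidden states of a bidirectional recurrent encoder into a single context vector for forecasting. The function names (`attention_pool`, `softmax`) and the dot-product scoring scheme are assumptions, not the paper's definitions.

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each timestep's (concatenated forward/backward) hidden
    state by its dot-product similarity to a query vector, then return
    the weighted sum (context vector) and the attention weights."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return context, weights
```

In a full A-Bi-GRU-style model, `hidden_states` would come from a bidirectional GRU run over the recent request-rate window, and `context` would feed a regression head that predicts the next window's load.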
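Likewise, DP-GA's specifics are not given here, but NSGA-II-style algorithms all rest on Pareto dominance over the objective vectors; a minimal sketch for the two objectives named in the abstract (average response time and scaling cost, both minimized) follows. The helper names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b under minimization:
    a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the non-dominated subset of (response_time, cost) tuples."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```

For example, among the candidate scaling schemes `(10, 5)`, `(8, 7)`, `(12, 4)`, and `(11, 6)`, the first three are mutually non-dominated while `(11, 6)` is dominated by `(10, 5)`; an NSGA-II-style search would rank such fronts and evolve toward better time/cost trade-offs per observation window.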