Graph-Based Ensemble Learning for Enhanced Fault Localization in Microservices

Ruibo Chen, Fang Peng, Xin Ji, Nan Xiang, Yihua Lou, Kui Zhang, Yanjun Pu, Wenjun Wu

Published: 2024, Last Modified: 07 Jan 2026, SMC 2024, CC BY-SA 4.0

Abstract: As microservices architectures become increasingly prevalent, they introduce significant operational challenges due to the complexities in service interactions and fault propagation. These architectures often conceal the origins of faults due to intricate inter-service communications, making fault localization both critical and challenging. Addressing these difficulties, this paper introduces a novel fault localization method that leverages synergies between domain prior knowledge, ensemble learning, and graph-based modeling. Our approach models microservices as a graph, with services as nodes and their interactions as edges, illuminating complex dependencies and enhancing the depth of data analysis. The method integrates expert knowledge with a unique blend of multi-class decision trees and strategy models derived from a knowledge base, enabling effective detection of diverse patterns and anomalies. Additionally, a meta-learner refines the outputs from base models using a weighted decision-making process, significantly improving the accuracy and robustness of fault detection. Compared to traditional models, including graph neural networks, our approach substantially reduces model complexity and enhances adaptability to evolving service patterns. It demonstrates superior scalability and real-time processing capabilities, offering a robust solution to the challenges of fault localization in dynamic microservice environments.
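
Illustrative sketch (not the authors' implementation): the abstract describes services as graph nodes, interactions as edges, and a meta-learner that combines base-model outputs through weighted decision-making. The Python below shows one minimal way such a pipeline could be wired together. Every name in it (`CALL_GRAPH`, `rule_based_scores`, `combine`), the service names, the scores, and the weights are assumptions made for illustration. The decision-tree output is stood in for by a fixed dictionary, and the weights are fixed by hand, whereas the paper's meta-learner presumably learns them.

```python
from collections import defaultdict

# Hypothetical call graph: services are nodes, caller -> callee calls are edges.
CALL_GRAPH = {
    "gateway": ["orders", "users"],
    "orders": ["payments", "inventory"],
    "users": [],
    "payments": [],
    "inventory": [],
}


def rule_based_scores(graph, anomalous):
    """Assumed stand-in for a knowledge-base strategy model.

    Heuristic: faults propagate from callee to caller, so an anomalous
    service with no anomalous callee is the likeliest origin.
    """
    scores = {}
    for service, callees in graph.items():
        if service not in anomalous:
            scores[service] = 0.0
        elif any(callee in anomalous for callee in callees):
            scores[service] = 0.3  # probably a victim of a downstream fault
        else:
            scores[service] = 1.0  # anomalous, with healthy dependencies
    return scores


def combine(base_outputs, weights):
    """Weighted sum of per-service scores, ranked from most to least suspect."""
    total = defaultdict(float)
    for model_name, scores in base_outputs.items():
        for service, score in scores.items():
            total[service] += weights[model_name] * score
    return sorted(total.items(), key=lambda item: item[1], reverse=True)


# Services currently showing anomalous metrics (made-up observation).
anomalous = {"gateway", "orders", "payments"}

base_outputs = {
    "rules": rule_based_scores(CALL_GRAPH, anomalous),
    # Placeholder for per-service probabilities from a multi-class decision tree.
    "tree": {"gateway": 0.1, "orders": 0.5, "users": 0.0,
             "payments": 0.4, "inventory": 0.0},
}
weights = {"rules": 0.6, "tree": 0.4}  # hand-picked here, not learned

ranking = combine(base_outputs, weights)
print(ranking[0][0])  # top-ranked root-cause candidate
```

With these made-up inputs, `payments` ranks first because both base models assign it a high score and it has no anomalous dependency of its own. In a real system the weights would be fitted on labeled incidents rather than chosen by hand.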

External IDs: dblp:conf/smc/ChenPJXLZP024