Abstract: Microservice architecture has become a popular architecture adopted by many cloud applications. However, identifying the root cause of a failure in microservice systems is still a challenging and time-consuming task. In recent years, researchers have introduced various causal inference-based root cause analysis methods to assist engineers in identifying the root causes. To gain a better understanding of the current status of causal inference-based root cause analysis techniques for microservice systems, we conduct a comprehensive evaluation of nine causal discovery methods and twenty-one root cause analysis methods. Our evaluation aims to understand both the effectiveness and efficiency of causal inference-based root cause analysis methods, as well as other factors that affect their performance. Our experimental results and analyses indicate that no method stands out in all situations; each method tends to either fall short in effectiveness, efficiency, or shows sensitivity to specific parameters. Notably, the performance of root cause analysis methods on synthetic datasets may not accurately reflect their performance in real systems. Indeed, there is still a large room for further improvement. Furthermore, we also suggest possible future work based on our findings.
Loading