Triple: The Interpretable Deep Learning Anomaly Detection Framework based on Trace-Metric-Log of Microservice

Published: 2023, Last Modified: 06 Feb 2025. Venue: IWQoS 2023. License: CC BY-SA 4.0
Abstract: Existing deep-learning-based anomaly detection approaches can extract key information from at most two of the three data sources (traces, metrics, and logs) at a time. Moreover, they output only a binary result, which ignores the key human-written statements in the logs. In this paper, we propose Triple, an interpretable deep-learning-based anomaly detection approach for microservice systems. More importantly, Triple aims to help engineers establish trust in the system's decisions by surfacing key metrics and the human-written statements in logs. Triple uses a graph representation to describe the complicated dependency relationships in the traces, with the logs and metrics embedded into the node features. Based on this graph representation, Triple trains a Spatial-Temporal Graph Convolutional Network (STGCN) to capture the key information and uses deep SVDD to generate a decision boundary for detecting system anomalies. In addition, we design an interpreter that translates the binary result into a human-understandable result, including logs, metrics, and traces, to help engineers understand and handle the incoming incident. Our work makes four contributions. First, to the best of our knowledge, we are the first to use all three data sources simultaneously for anomaly detection in this domain. Second, we design a new anomaly detection method that combines an STGCN with SVDD. Third, we design an interpreter so that the decision is more than a simple binary result; the interpretable result captures the key human-written statements in the logs and assists engineers in incident troubleshooting. Finally, we design a series of experiments to validate our method's effectiveness on a dataset from a real-world system. Our results show that Triple consistently outperforms other state-of-the-art models by 11%-65%.
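To make the deep SVDD decision boundary mentioned in the abstract more concrete, the following is a minimal sketch of the general one-class idea, not Triple's actual implementation. It assumes that some encoder (in Triple, the STGCN) has already mapped each graph snapshot to a fixed-length embedding; here the embeddings are plain Python lists. The function names, the mean-based center, and the quantile-based radius are all illustrative assumptions rather than details taken from the paper.

```python
import math

def fit_center(embeddings):
    """Hypothetical helper: the hypersphere center is the mean of the
    embeddings of normal training samples (an assumed initialization)."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]

def anomaly_score(embedding, center):
    """Squared Euclidean distance to the center; larger means more anomalous."""
    return sum((x - c) ** 2 for x, c in zip(embedding, center))

def fit_radius_sq(embeddings, center, quantile=0.95):
    """Assumed thresholding rule: the squared radius is the given quantile
    of the training scores, so most normal samples fall inside the sphere."""
    scores = sorted(anomaly_score(e, center) for e in embeddings)
    idx = min(len(scores) - 1, max(0, math.ceil(quantile * len(scores)) - 1))
    return scores[idx]

def is_anomalous(embedding, center, radius_sq):
    """A sample is flagged when it lies outside the hypersphere."""
    return anomaly_score(embedding, center) > radius_sq

# Toy embeddings standing in for encoder outputs on normal data.
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
center = fit_center(train)
radius_sq = fit_radius_sq(train, center)
```

In a full deep SVDD setup the encoder weights are also trained to pull normal embeddings toward the center; this sketch covers only the scoring and thresholding step, which yields the binary result that an interpreter would then need to explain.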