LogRAG: Semi-Supervised Log-based Anomaly Detection with Retrieval-Augmented Generation

Wanhao Zhang, Qianli Zhang, Enyu Yu, Yuxiang Ren, Yeqing Meng, Mingxi Qiu, Jilong Wang

Published: 2024, Last Modified: 20 May 2025, ICWS 2024, CC BY-SA 4.0

Abstract: Log-based anomaly detection is critical for monitoring the operation of microservice systems and for reporting system failures in real time. Deep learning-based log anomaly detection methods can detect anomalies within logs effectively. However, existing methods depend heavily on log parsers, and parsing errors can considerably degrade downstream anomaly detection. Additionally, methods that predict the next log event in a sequence are susceptible to unstable sequences and to unseen logs that emerge as systems evolve, which results in a higher false positive rate. In this paper, we propose a semi-supervised log anomaly detection framework based on retrieval-augmented generation (RAG). The framework conducts phased detection using both Log Tokens and Log Templates to mitigate the impact of log parsing errors. It also uses a single-class classifier to model the normal behavior of the system, thereby circumventing the effects of unstable sequences. Finally, it employs a large language model (LLM) empowered by RAG to reevaluate the logs detected as anomalous.
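
The last two stages described in the abstract (a model of normal behavior, followed by a retrieval-backed LLM recheck) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function, the nearest-neighbor similarity threshold standing in for the single-class classifier, the prompt wording, and the `llm_judge` callable are all assumptions made for illustration, and the phased Log Token / Log Template detection is omitted.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real log embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class NormalBehaviorModel:
    """Built from normal logs only (assumed nearest-neighbor one-class model)."""

    def __init__(self, normal_logs, threshold=0.5):
        self.normal = [(log, embed(log)) for log in normal_logs]
        self.threshold = threshold

    def retrieve(self, log, k=2):
        """Return the k normal logs most similar to the query log."""
        query = embed(log)
        ranked = sorted(self.normal, key=lambda item: cosine(query, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def is_suspicious(self, log):
        """A log is suspicious if it is far from every known normal log."""
        query = embed(log)
        best = max((cosine(query, vec) for _, vec in self.normal), default=0.0)
        return best < self.threshold


def detect(log, model, llm_judge):
    """Flag a log only if the normal-behavior model and the LLM recheck agree."""
    if not model.is_suspicious(log):
        return False  # close to normal behavior; the LLM is not consulted
    context = model.retrieve(log)
    prompt = (
        "Normal log examples:\n" + "\n".join(context)
        + "\nIs the following log anomalous?\n" + log
    )
    # llm_judge is a hypothetical callable: prompt string in, bool verdict out.
    return llm_judge(prompt)
```

In this sketch the LLM is called only for logs that the first stage already flags, so the recheck can overturn a false positive but never adds a new alert.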