Abstract: Log-based anomaly detection is critical in monitoring the operation of microservice systems and in the realtime reporting of system failures. Utilizing deep learning-based log anomaly detection methods facilitates effective detection of anomalies within logs. However, existing methods are greatly dependent on log parsers, and parsing errors can considerably affect downstream anomaly detection tasks. Additionally, methods that predict the next log event in a sequence are susceptible to the instability of sequences and the emergence of unseen logs as systems evolve, resulting in a higher false positive rate. In this paper, we propose a semi-supervised log anomaly detection framework based on retrieval-augmented generation (RAG). This framework conducts phased detection using both Log Tokens and Log Templates to mitigate the impact of log parsing errors. It also utilizes a single-class classifier to model the normal behavior of the system, thereby circumventing the effects of unstable sequences. Finally, it employs large language model (LLM) empowered by RAG to reevaluate detected anomalous logs.
Loading