High-precision Online Log Parsing with Large Language Models

Published: 01 Jan 2024, Last Modified: 20 May 2025, Venue: ICSE Companion 2024, License: CC BY-SA 4.0
Abstract: System logs are vital for diagnosing system failures, and log parsing converts unstructured logs into structured data. Existing methods fall into two categories. Non-deep-learning approaches cluster logs based on statistical features but often miss semantic information, resulting in poor performance. Deep-learning approaches excel at identifying variables and constants but often generalize poorly beyond their training data and suffer from low efficiency. This paper proposes a novel LLM-based log parsing approach, named Hooglle, to address these challenges. Leveraging a large language model, Hooglle extracts templates for precise and generalized parsing. To overcome the efficiency issue, we propose a prefix-tree-based full-matching strategy that significantly improves parsing efficiency. Extensive evaluation on 16 public, real-world benchmark datasets demonstrates Hooglle's superior performance.
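
The abstract does not describe how the prefix-tree-based full-matching strategy is implemented, so the following is only a minimal sketch of the general idea, not Hooglle's actual code. It assumes whitespace tokenization, a `<*>` placeholder that matches exactly one token, and a hypothetical `extract_template` callback standing in for the LLM call. The names `TemplateTree`, `parse`, and `toy_extractor` are illustrative. A log matches a stored template only if every token matches from start to end, and the extractor is invoked only on a miss.

```python
from typing import Callable, Dict, List, Optional

WILDCARD = "<*>"  # assumed placeholder for one variable token


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.template: Optional[str] = None  # set only where a template ends


class TemplateTree:
    """Prefix tree over template tokens (illustrative, not the paper's code)."""

    def __init__(self) -> None:
        self.root = Node()

    def insert(self, template: str) -> None:
        node = self.root
        for token in template.split():
            node = node.children.setdefault(token, Node())
        node.template = template

    def match(self, log: str) -> Optional[str]:
        """Return a template that matches the whole log line, or None."""
        return self._match(self.root, log.split(), 0)

    def _match(self, node: Node, tokens: List[str], i: int) -> Optional[str]:
        if i == len(tokens):
            # Full match: all tokens consumed AND a template ends at this node.
            return node.template
        # Prefer the exact token, then fall back to the wildcard branch.
        for key in (tokens[i], WILDCARD):
            child = node.children.get(key)
            if child is not None:
                found = self._match(child, tokens, i + 1)
                if found is not None:
                    return found
        return None


def parse(log: str, tree: TemplateTree,
          extract_template: Callable[[str], str]) -> str:
    """Return the template for a log, calling the extractor only on a miss."""
    template = tree.match(log)
    if template is None:
        template = extract_template(log)  # stand-in for the expensive LLM call
        tree.insert(template)
    return template


def toy_extractor(log: str) -> str:
    """Toy stand-in for an LLM: treat any token containing a digit as a variable."""
    return " ".join(
        WILDCARD if any(c.isdigit() for c in tok) else tok
        for tok in log.split()
    )
```

In this sketch, parsing `Connection from 10.0.0.1 closed` calls the extractor once and stores `Connection from <*> closed`. A later line such as `Connection from 10.0.0.2 closed` is then resolved from the tree alone, which is where the efficiency gain would come from under these assumptions.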