ScalaLog: Scalable Log-Based Failure Diagnosis Using LLM

Published: 2025 | Last Modified: 06 Jan 2026 | Venue: ICASSP 2025 | License: CC BY-SA 4.0
Abstract: As Industrial Internet of Things (IIoT) software systems become increasingly complex, precise failure diagnosis has become both essential and challenging. Current log-based failure diagnosis methods scale poorly as new failure types appear. In IIoT software systems, the number of failure types is constantly growing, and retraining the model each time a new failure type is introduced is highly resource-intensive. Additionally, traditional log-based failure diagnosis models often require log parsing as a preliminary step, which can also be resource-consuming. To address these challenges, we propose a scalable log-based failure diagnosis method named ScalaLog. ScalaLog builds on retrieval-augmented generation (RAG): it uses summarization by a large language model (LLM) to extract key log information, applies sample augmentation to increase the number of samples, and uses chain-of-thought (CoT) prompts to guide the LLM in failure diagnosis. Experiments on various public and real-world datasets demonstrate that ScalaLog significantly enhances failure diagnosis accuracy without the need for training or log parsing.
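The abstract does not give implementation details, so the following is only a minimal sketch of the general retrieve-then-prompt pattern it describes, not the authors' method. Everything in it is an assumption made for illustration: the knowledge-base entries, the failure-type labels, the bag-of-words cosine retriever, the prompt wording, and the function names `retrieve`, `build_cot_prompt`, and `add_failure_type`. The LLM summarization and sample augmentation steps are omitted, and no LLM is actually called.

```python
import math
from collections import Counter

# Hypothetical knowledge base of (log summary, failure type) pairs.
# In a RAG setup these would be stored summaries of past failures.
KNOWLEDGE_BASE = [
    ("connection to database timed out after retries", "database_timeout"),
    ("disk usage exceeded quota, write failed", "disk_full"),
    ("sensor heartbeat missing, device unreachable", "device_offline"),
]


def tokenize(text):
    """Lowercase and split on whitespace, treating commas as separators."""
    return text.lower().replace(",", " ").split()


def cosine(a, b):
    """Cosine similarity between bag-of-words vectors of two strings."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, kb, k=2):
    """Return the k knowledge-base entries most similar to the query."""
    return sorted(kb, key=lambda item: cosine(query, item[0]), reverse=True)[:k]


def build_cot_prompt(query_summary, examples):
    """Assemble a chain-of-thought style prompt from retrieved examples."""
    lines = [
        "You are diagnosing a software failure from a log summary.",
        "Known cases:",
    ]
    for summary, label in examples:
        lines.append(f"- Summary: {summary} -> Failure type: {label}")
    lines.append(f"New summary: {query_summary}")
    lines.append("Think step by step, then state the failure type.")
    return "\n".join(lines)


def add_failure_type(kb, summary, label):
    """Supporting a new failure type only appends an entry; nothing is retrained."""
    kb.append((summary, label))


if __name__ == "__main__":
    query = "database connection timed out"
    prompt = build_cot_prompt(query, retrieve(query, KNOWLEDGE_BASE))
    print(prompt)  # this string would be sent to an LLM of your choice
```

The design point this illustrates is the scalability argument: because failure types live in a retrievable store instead of in model weights, adding one is a data update.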