Abstract: Log-based anomaly detection (LAD) is one of the dominant approaches to improving the reliability and security of software systems. Although state-of-the-art LAD approaches are effective on static log events, their performance degrades significantly when system updates change the set of log event types. To build a reliable LAD model that adapts to the evolution of log data, we propose SSDALog, an incremental log anomaly detection method grounded in semi-supervised domain adaptation. SSDALog dynamically updates the model using a limited number of labeled samples to reconcile the distributional shift between evolving and historical data. Specifically, the approach relies on two primary mechanisms: (i) a cross-domain mixup algorithm, which computes the feature salience of discrete log sequences through an occlusion strategy and mixes in evolving features to enhance the model's adaptability to unknown patterns; and (ii) an incremental semi-supervised domain adaptation training framework based on noisy label learning, which yields a robust feature extractor and improves the generalization ability of the detection model. We empirically assess SSDALog on two publicly available datasets. The experimental results show that our method outperforms state-of-the-art LAD approaches, particularly for evolving systems.
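As a rough illustration of mechanism (i), the sketch below shows one way occlusion-based salience and cross-domain mixing could be combined on discrete log event sequences. It is not the SSDALog implementation: the mask id `MASK`, the helper names `occlusion_saliency` and `cross_domain_mix`, the top-`k` selection rule, and the toy scoring function are all assumptions made for illustration, and a real system would use a trained detector in place of `toy_score`.

```python
from typing import Callable, List, Sequence

MASK = 0  # hypothetical event id used to occlude a position


def occlusion_saliency(
    seq: Sequence[int], score: Callable[[Sequence[int]], float]
) -> List[float]:
    """Salience of each position: how much the score changes when it is masked."""
    base = score(seq)
    saliency = []
    for i in range(len(seq)):
        occluded = list(seq)
        occluded[i] = MASK  # occlude exactly one event
        saliency.append(abs(base - score(occluded)))
    return saliency


def cross_domain_mix(
    source: Sequence[int],
    target: Sequence[int],
    score: Callable[[Sequence[int]], float],
    k: int,
) -> List[int]:
    """Copy the k most salient target events into the same positions of source."""
    if len(source) != len(target):
        raise ValueError("sequences must have equal length")
    sal = occlusion_saliency(target, score)
    # Indices of the k highest-salience positions (ties keep original order).
    top = sorted(range(len(target)), key=lambda i: sal[i], reverse=True)[:k]
    mixed = list(source)
    for i in top:
        mixed[i] = target[i]
    return mixed


def toy_score(seq: Sequence[int]) -> float:
    """Stand-in for a trained detector: fraction of events with id >= 100."""
    return sum(1 for e in seq if e >= 100) / len(seq)


historical = [1, 2, 3, 4, 5]      # sequence from the historical (source) domain
evolving = [1, 101, 3, 102, 5]    # sequence containing new event types
mixed = cross_domain_mix(historical, evolving, toy_score, k=2)
print(mixed)
```

In this toy setting the two new event types carry all of the salience, so they are the positions transplanted into the historical sequence. How salience is turned into a mixing rule in the actual method is not specified in the abstract.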
External IDs: dblp:journals/tifs/TianLCWNQ25