Keywords: Knowledge graphs, Log analysis, Log vocabularies, Graph modelling patterns
Abstract: Log files are a vital source of information for keeping systems running and healthy. However, analyzing raw log data, i.e., textual records of system events, typically involves tediously searching for and inspecting clues, then tracing and correlating them across log sources. Existing log management solutions ease this process with efficient data collection, storage, and normalization mechanisms, but identifying and linking entities across log sources and enriching them with background knowledge remains a largely unresolved challenge. To facilitate a knowledge-based approach to log analysis, this paper introduces SLOGERT, a flexible framework and workflow for the automated construction of knowledge graphs from arbitrary raw log messages. At its core, it automatically identifies rich RDF graph modelling patterns to represent the types of events and the extracted parameters that appear in a log stream. We present the workflow, the vocabularies developed for log integration, and our prototype implementation. To demonstrate the viability of this approach, we conduct a performance analysis and illustrate its application on a large public log dataset in the security domain.
First Author Is Student: No