Explaining Tree Model Decisions in Natural Language for Network Intrusion Detection

Published: 27 Oct 2023, Last Modified: 16 Nov 2023, NeurIPS XAIA 2023
TL;DR: Large language models can generate high-quality natural language explanations of decision tree inference using background knowledge.
Abstract: Network intrusion detection (NID) systems that leverage machine learning have been shown to perform well in practice when used to detect malicious network traffic. Decision trees in particular offer a strong balance between performance and simplicity, but they require users of NID systems to have background knowledge in machine learning to interpret, and they cannot provide outside information about why certain features may be important for classification. In this work, we explore the use of large language models (LLMs) to provide explanations and additional background knowledge for decision tree NID systems. Further, we introduce a new human evaluation framework for decision tree explanations, which leverages automatically generated quiz questions that measure human evaluators' understanding of decision tree inference. Finally, we show that LLM-generated decision tree explanations correlate highly with human ratings of readability, quality, and use of background knowledge while simultaneously providing a better understanding of decision boundaries.
Submission Track: Full Paper Track
Application Domain: Natural Language Processing
Survey Question 1: Network intrusion detection systems monitor incoming and outgoing network traffic to detect potentially malicious activity. In practice, ML-based network intrusion detection systems such as decision trees often go untrusted because they require background knowledge in machine learning to interpret, and they cannot explain why certain features are important for classifying network traffic as malicious or benign. In this work, we explore how to use large language models to generate explanations for decision tree inference in NID systems.
Survey Question 2: Explainability is important to security practitioners who use NID systems: network traffic that is potentially malicious must be blocked immediately, yet accidentally blocking benign traffic must also be avoided, as it may bring entire systems down. Existing approaches such as LIME and SHAP offer insight into which features were most important for a particular inference, but they are often not enough for network administrators to make a final decision on whether to allow or block traffic, as they do not incorporate any background knowledge about the features and still require a background in machine learning to interpret.
Survey Question 3: In this paper, we employ large language models to assist in explaining decision tree inferences for network intrusion detection.
Submission Number: 59
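The core idea described above — extracting the threshold tests along a decision tree's inference path and handing them to an LLM for a natural-language explanation — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the tree structure, feature names (`src_bytes`, `duration`), thresholds, and prompt wording are all hypothetical.

```python
# Hypothetical sketch: turn a decision-tree inference path into an LLM prompt.
# The tree, features, and thresholds below are illustrative, not from the paper.

TREE = {
    "feature": "src_bytes", "threshold": 1000,
    "left": {"feature": "duration", "threshold": 5,
             "left": {"label": "benign"},
             "right": {"label": "malicious"}},
    "right": {"label": "malicious"},
}

def trace_path(node, sample, path=None):
    """Walk the tree for one sample, recording each threshold test in plain language."""
    path = [] if path is None else path
    if "label" in node:                       # reached a leaf: return the prediction
        return node["label"], path
    value = sample[node["feature"]]
    if value <= node["threshold"]:
        path.append(f"{node['feature']} = {value} <= {node['threshold']}")
        return trace_path(node["left"], sample, path)
    path.append(f"{node['feature']} = {value} > {node['threshold']}")
    return trace_path(node["right"], sample, path)

def build_prompt(sample):
    """Assemble the decision path into a prompt asking an LLM to explain it."""
    label, path = trace_path(TREE, sample)
    rules = "\n".join(f"- {r}" for r in path)
    return (
        f"A decision tree classified a network flow as '{label}' "
        f"using these tests:\n{rules}\n"
        "Explain this decision for a network administrator, "
        "adding background knowledge on why each feature matters."
    )

print(build_prompt({"src_bytes": 450, "duration": 12}))
```

The resulting prompt would then be sent to an LLM, which supplies the readability and domain background that the raw threshold tests lack.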