Modeling and Detecting Anomalous Topic Access

Siddharth Gupta, Carl A. Gunter, Bradley A. Malin, Mario Frank, David Liebovitz

Published: 04 Jun 2013, Last Modified: 06 Oct 2025. Venue: 2013 IEEE International Conference on Intelligence and Security Informatics. License: CC BY 4.0

Abstract: There has been considerable success in developing strategies to detect insider threats in information systems based on what one might call the random object access (ROA) model. This approach models illegitimate users as ones who access records at random. The goal is to use statistics, machine learning, knowledge of workflows, and other techniques to support an anomaly detection framework that finds such users. In this paper we introduce and study a random topic access (RTA) model aimed at users whose access may be illegitimate but is not fully random because it is focused on common semantic themes. We argue that this model is appropriate for a meaningful range of attacks and develop a system based on topic summarization that formalizes the model and provides effective anomalous user detection for it. To this end, we use healthcare as an example and propose a framework, called random topic access detection (RTAD), for evaluating the ability to recognize various types of random users. Specifically, we combine Latent Dirichlet Allocation (LDA) for feature extraction with a k-nearest neighbor (k-NN) algorithm for outlier detection, and evaluate the ability to identify different adversarial types. We validate the technique in the context of hospital audit logs, where we show varying degrees of success based on user roles and the anticipated characteristics of attackers. In particular, we find that RTAD exhibits strong performance for roles that are described by a few topics, but weaker performance when users are more topic-agnostic.
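The outlier-detection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each user has already been summarized as a topic-mixture vector (for example, by LDA) and scores a user by the mean Euclidean distance to its k nearest neighbors. The function name `knn_outlier_scores`, the toy vectors, and the choice of distance are all assumptions made for illustration.

```python
import math


def knn_outlier_scores(vectors, k=2):
    """Score each vector by its mean Euclidean distance to its k nearest neighbors.

    Larger scores indicate users whose topic mixture is far from their peers.
    """
    scores = []
    for i, v in enumerate(vectors):
        # Distances from user i to every other user, smallest first.
        dists = sorted(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical per-user topic mixtures for one role (each row sums to 1).
user_topics = [
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.88, 0.07, 0.05],
    [0.92, 0.04, 0.04],
    [0.05, 0.05, 0.90],  # user focused on a topic atypical for the role
]

scores = knn_outlier_scores(user_topics, k=2)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

In this toy data the last user concentrates on a topic that the other users of the role rarely touch, so it receives the largest score. When a role's users spread their accesses evenly across many topics, distances of this kind separate users less clearly, which is consistent with the weaker performance the abstract reports for topic-agnostic roles.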