Explaining high-dimensional text classifiers

Published: 27 Oct 2023, Last Modified: 22 Nov 2023 · NeurIPS XAIA 2023
TL;DR: A novel method for generating explanations on high-dimensional textual data that requires no additional special training.
Abstract: Explainability has become a valuable tool in the last few years, helping humans better understand AI-guided decisions. However, classic explainability tools can be quite limited when applied to high-dimensional inputs and neural network classifiers. We present a new explainability method that uses theoretically proven high-dimensional properties of neural network classifiers. We demonstrate two uses of it: 1) the classical sentiment analysis task on the IMDB reviews dataset, and 2) our malware-detection task on our PowerShell scripts dataset.
Submission Track: Full Paper Track
Application Domain: Natural Language Processing
Clarify Domain: Security context, coding languages
Survey Question 1: Text classifiers based on neural networks have gained popularity but have a hard time earning human trust, as the models are considered black boxes and the data is non-continuous. Our new method generates explanations using simple manifold-related techniques, inspired by research on adversarial examples. It demands no extra information about the model design, only rerunning the same training a few times (see the sketch below), which makes it much more accessible for today's real-life classification systems.
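The submission does not spell out the procedure here, but as a minimal, hypothetical sketch of the "rerun the same training a few times" ingredient only (not the authors' explanation algorithm), one could train an ensemble of identically configured classifiers that differ solely in their random seed. The toy data and architecture below are illustrative stand-ins.

```python
import torch

# Hypothetical sketch: the same architecture, data, and hyperparameters,
# rerun a few times with only the random seed varying between runs.
# Nothing here is the authors' code; it illustrates the training-rerun idea.

X = torch.randn(256, 768)        # stand-in document embeddings, shared by all runs
y = torch.randint(0, 2, (256,))  # stand-in binary labels

def train_once(seed: int) -> torch.nn.Module:
    torch.manual_seed(seed)      # the only thing that changes per run
    model = torch.nn.Sequential(
        torch.nn.Linear(768, 128), torch.nn.ReLU(), torch.nn.Linear(128, 2)
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(20):          # identical training procedure every run
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(X), y)
        loss.backward()
        opt.step()
    return model

# A small ensemble of independently trained copies of the same classifier;
# how the paper aggregates such runs into explanations is its contribution
# and is not reproduced here.
models = [train_once(seed) for seed in range(5)]
```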
Survey Question 2: In our textual domain, with a huge amount of data updating hourly, it is very hard to "look into" a single classification and investigate the model's classification capabilities. These capabilities are readily revealed by looking at explanations, which first of all help us conduct profound security research. From a broader perspective, such explanations can advance the research significantly, pinpointing the relevant suspicious parts of the code, and much more.
Survey Question 3: We employ LIME, SHAP, and max-norm gradients, and compare their results to our proposed novel method.
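For concreteness, here is a minimal sketch of the max-norm gradient baseline for a text classifier. The checkpoint is a public stand-in (not the submission's model), and reading "max-norm" as the largest absolute gradient component per token embedding is an assumption, since the submission does not define it.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Stand-in sentiment model; the submission does not specify its classifiers.
name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

inputs = tokenizer("The movie was surprisingly good.", return_tensors="pt")

# Embed the tokens manually so gradients can be taken w.r.t. each token vector.
embeds = model.get_input_embeddings()(inputs["input_ids"]).detach()
embeds.requires_grad_(True)

logits = model(inputs_embeds=embeds, attention_mask=inputs["attention_mask"]).logits
logits[0, logits[0].argmax()].backward()  # gradient of the predicted class score

# Max-norm gradient saliency (assumed reading): score each token by the
# largest absolute gradient entry across its embedding dimensions.
scores = embeds.grad[0].abs().max(dim=-1).values
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, s in zip(tokens, scores):
    print(f"{tok:>12s}  {s.item():.4f}")
```

LIME and SHAP baselines would follow the same pattern via their respective Python packages, attributing the same predicted class score to input tokens.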
Submission Number: 71