Keywords: Interpretability, Activation Maximization, Tabular Deep Learning, Large Scale Models
TL;DR: We propose a new feature selection method that adapts activation maximization. We present results on one of the largest-scale tabular neural networks and suggest how the method could be applied to LLMs.
Abstract: Interpretability of Deep Neural Networks (DNNs) is crucial when designing reliable and trustworthy models. However, there is a lack of interpretability methods for DNNs applied to tabular data. In this short paper, we propose a novel feature importance method, based on activation maximization, that applies to any Tabular Deep Learning model. The method makes it possible to discard features that are uninformative for the network. We present preliminary results on one of the largest-scale tabular networks. In addition, we suggest how the method can be applied to Large Language Models (LLMs) to systematically study their biases.
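The abstract does not spell out the procedure, so the following is only a minimal sketch of one way activation maximization could yield a feature importance score, not the authors' method. It assumes a hypothetical toy network (the weights `W1` and `W2`), a hypothetical L2 penalty on the input, and a hypothetical scoring rule: run gradient ascent on the input to maximize the output, then score each feature by how far it moved from its starting value. Features the network ignores receive zero gradient and therefore zero score.

```python
import math

# Hypothetical toy network: 3 input features, 4 tanh hidden units, 1 output.
# Feature 2 has all-zero weights, so the network ignores it by construction.
W1 = [
    [1.0, -0.5, 0.3, 0.8],  # weights from feature 0
    [0.2, 0.7, -0.6, 0.1],  # weights from feature 1
    [0.0, 0.0, 0.0, 0.0],   # weights from feature 2 (uninformative)
]
W2 = [1.0, -1.0, 0.5, 0.7]  # hidden-to-output weights


def forward(x):
    """Return the scalar output and the hidden activations for input x."""
    hidden = [
        math.tanh(sum(x[i] * W1[i][j] for i in range(len(x))))
        for j in range(len(W2))
    ]
    output = sum(w * h for w, h in zip(W2, hidden))
    return output, hidden


def input_gradient(x):
    """Gradient of the output with respect to each input feature."""
    _, hidden = forward(x)
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    delta = [w * (1.0 - h * h) for w, h in zip(W2, hidden)]
    return [
        sum(W1[i][j] * delta[j] for j in range(len(W2)))
        for i in range(len(x))
    ]


def activation_maximization_importance(x_init, steps=200, lr=0.05, l2=0.1):
    """Gradient ascent on output - (l2 / 2) * ||x||^2, starting from x_init.

    The importance of a feature is taken (as an assumption) to be the
    absolute displacement of that feature after optimization.
    """
    x = list(x_init)
    for _ in range(steps):
        grad = input_gradient(x)
        x = [xi + lr * (g - l2 * xi) for xi, g in zip(x, grad)]
    return [abs(a - b) for a, b in zip(x, x_init)]


importance = activation_maximization_importance([0.0, 0.0, 0.0])
print(importance)
```

In this sketch, a feature whose score stays at zero would be a candidate for removal. A real tabular model would need automatic differentiation and a treatment of categorical inputs, neither of which is covered here.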