Understanding intermediate layers using linear classifier probes

Guillaume Alain; Yoshua Bengio

12 Jul 2025 (modified: 22 Jun 2025). Submitted to ICLR 2017. Readers: Everyone

TL;DR: Investigating deep learning models by proposing a different concept of information

Abstract: Neural network models have a reputation for being black boxes. We propose a new method to better understand the roles and dynamics of the intermediate layers. Our method uses linear classifiers, referred to as "probes", where a probe can only use the hidden units of a given intermediate layer as discriminating features. Moreover, these probes cannot affect the training phase of a model, and they are generally added after training. We demonstrate how this can be used to develop a better intuition about models and to diagnose potential problems.
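The probing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy two-layer ReLU network with random, frozen weights, synthetic two-class data, and a softmax probe fitted by plain gradient descent in NumPy. All function names (`hidden_activations`, `probe_error`, `run_demo`) and hyperparameters are hypothetical choices made for illustration. One probe is fitted per layer, it sees only that layer's activations, and the network's own weights are never updated.

```python
import numpy as np


def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def hidden_activations(x, weights):
    """Forward pass through a frozen ReLU MLP; returns the input plus each layer's output."""
    activations = [x]
    h = x
    for w, b in weights:
        h = np.maximum(h @ w + b, 0.0)
        activations.append(h)
    return activations


def probe_error(train_x, train_y, test_x, test_y, num_classes, lr=0.1, steps=200):
    """Fit a linear softmax probe on fixed features; return its held-out error rate."""
    mean = train_x.mean(axis=0)
    std = train_x.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # guard against constant (e.g. dead ReLU) units
    train_x = (train_x - mean) / std
    test_x = (test_x - mean) / std

    w = np.zeros((train_x.shape[1], num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[train_y]
    for _ in range(steps):
        # Gradient of the mean cross-entropy w.r.t. the logits.
        # Only the probe's own parameters (w, b) are updated.
        grad = (softmax(train_x @ w + b) - onehot) / len(train_x)
        w -= lr * (train_x.T @ grad)
        b -= lr * grad.sum(axis=0)
    predictions = (test_x @ w + b).argmax(axis=1)
    return float((predictions != test_y).mean())


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    # Assumed toy data: two Gaussian blobs centred at (-2, -2) and (2, 2).
    labels = rng.integers(0, 2, size=400)
    inputs = rng.normal(size=(400, 2)) + 4.0 * labels[:, None] - 2.0
    # Assumed stand-in for a model: random weights that are never trained.
    weights = [
        (rng.normal(size=(2, 8)), np.zeros(8)),
        (rng.normal(size=(8, 8)), np.zeros(8)),
    ]
    activations = hidden_activations(inputs, weights)
    # One independent probe per layer (index 0 is the raw input).
    return [
        probe_error(a[:200], labels[:200], a[200:], labels[200:], num_classes=2)
        for a in activations
    ]


errors = run_demo()
for depth, err in enumerate(errors):
    print(f"layer {depth}: probe error = {err:.3f}")
```

Comparing the per-layer probe errors gives a rough picture of how linearly separable the classes are at each depth. In practice the same loop would be run on activations extracted from a real trained model.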

Keywords: Deep learning, Supervised Learning, Theory

Conflicts: umontreal.ca, google.com
