Investigating the Interpretability of Biometric Face Templates Using Gated Sparse Autoencoders and Differentiable Image Parametrizations
Keywords: Biometrics, Face recognition, Sparse autoencoders
TL;DR: We train a sparse autoencoder on face recognition templates and interpret individual sparse features through input optimization and by inspecting dataset samples.
Abstract: State-of-the-art face recognition models rely on deep, complex neural network architectures that produce relatively compact template vectors, which makes their inner workings difficult to interpret. Recently, mechanistic interpretability has emerged as a promising approach to explaining large language models. In this paper, we apply such approaches to face recognition models. Our method transforms face image templates into sparse representations and analyzes the individual components of those representations by identifying the images that maximize their activation. Our results demonstrate that existing mechanistic interpretability techniques generalize well to previously unconsidered tasks and architectures, and that differentiable image parametrizations can serve as a useful additional means of confirming the interpretation of sparse representations.
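As a rough illustration of the sparse-representation step summarized in the abstract, the sketch below encodes template vectors with a gated sparse autoencoder and then ranks dataset samples by how strongly they activate one sparse feature. It assumes the commonly described gated formulation, in which a binary gate path and a ReLU magnitude path share encoder directions up to a per-feature rescaling. The abstract does not state the paper's exact architecture, loss, or dimensions, so every function name, parameter name, and shape here is hypothetical, and the weights are untrained placeholders. Input optimization with differentiable image parametrizations is not shown.

```python
import numpy as np


def gated_sae_encode(x, W_gate, b_gate, r_mag, b_mag, b_dec):
    """Encode templates x of shape (n, d) into sparse features of shape (n, m).

    Assumed (hypothetical) parameter shapes: W_gate (d, m); b_gate, r_mag,
    b_mag (m,); b_dec (d,).
    """
    xc = x - b_dec                               # center inputs with the decoder bias
    pi_gate = xc @ W_gate + b_gate               # gate pre-activations
    gate = (pi_gate > 0).astype(x.dtype)         # binary mask: which features are active
    W_mag = W_gate * np.exp(r_mag)               # same directions, rescaled per feature
    mag = np.maximum(xc @ W_mag + b_mag, 0.0)    # non-negative feature magnitudes
    return gate * mag


def gated_sae_decode(f, W_dec, b_dec):
    """Reconstruct templates of shape (n, d) from sparse features f of shape (n, m)."""
    return f @ W_dec + b_dec


def top_activating_samples(features, feature_idx, k=5):
    """Return indices of the k samples with the largest activation of one feature."""
    order = np.argsort(-features[:, feature_idx], kind="stable")
    return order[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, m = 100, 16, 64                        # illustrative sizes only
    templates = rng.normal(size=(n, d))          # stand-in for real face templates
    W_gate = rng.normal(size=(d, m)) / np.sqrt(d)
    W_dec = rng.normal(size=(m, d)) / np.sqrt(m)
    zeros_m, zeros_d = np.zeros(m), np.zeros(d)

    feats = gated_sae_encode(templates, W_gate, zeros_m, zeros_m, zeros_m, zeros_d)
    recon = gated_sae_decode(feats, W_dec, zeros_d)
    print("feature matrix shape:", feats.shape)
    print("reconstruction shape:", recon.shape)
    print("top samples for feature 0:", top_activating_samples(feats, 0, k=5))
```

In practice the indices returned by `top_activating_samples` would be mapped back to the face images that produced the templates, so that the highest-activating images for a feature can be inspected side by side.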
Submission Number: 77