Attributes Shape the Embedding Space of Face Recognition Models

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We propose a metric that quantifies how invariant face recognition embeddings are to interpretable attributes.
Abstract: Face Recognition (FR) tasks have made significant progress with the advent of Deep Neural Networks, particularly through margin-based triplet losses that embed facial images into high-dimensional feature spaces. During training, these contrastive losses rely exclusively on identity labels. However, we observe a multiscale geometric structure emerging in the embedding space, influenced by interpretable facial attributes (e.g., hair color) and image attributes (e.g., contrast). We propose a geometric approach to describe the dependence or invariance of FR models to these attributes and introduce a physics-inspired alignment metric. We evaluate the proposed metric on controlled, simplified models and on widely used FR models fine-tuned with synthetic data for targeted attribute augmentation. Our findings reveal that the models exhibit varying degrees of invariance across attributes, providing insight into their strengths and weaknesses and enabling deeper interpretability. Code available here: https://github.com/mantonios107/attrs-fr-embs.
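The alignment metric itself is defined in the full paper; as a rough intuition for the microscale idea, one can ask how much of the embedding scatter within a single identity is explained by an interpretable attribute. The sketch below is a hypothetical illustration only: the function name `attribute_sensitivity`, the ANOVA-style between/within variance ratio, and the toy data are all assumptions for exposition, not the paper's definition.

```python
import numpy as np

def attribute_sensitivity(embs: np.ndarray, attr: np.ndarray) -> float:
    """Fraction of intra-identity embedding variance explained by a
    binary attribute (an ANOVA-style between/within variance ratio).
    High values suggest the model is sensitive to the attribute; low
    values suggest invariance. Illustrative, not the paper's metric."""
    mean = embs.mean(axis=0)
    total = ((embs - mean) ** 2).sum()           # total scatter around the grand mean
    between = 0.0
    for v in np.unique(attr):                    # scatter of per-attribute-value means
        group = embs[attr == v]
        between += len(group) * ((group.mean(axis=0) - mean) ** 2).sum()
    return between / total

# Toy example: 200 embeddings of one identity in 128-D, where a synthetic
# binary attribute (e.g. glasses on/off) displaces the embeddings.
rng = np.random.default_rng(0)
attr = rng.integers(0, 2, size=200)
embs = rng.normal(size=(200, 128))
embs[attr == 1] += 0.3                           # the attribute shifts the embedding
print(f"sensitivity: {attribute_sensitivity(embs, attr):.3f}")
```

An invariant model would drive this ratio toward zero for attributes that do not identify the person, which is the behaviour the paper's measures are designed to track.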
Lay Summary: Face Recognition (FR) models map every image to a point in a high-dimensional, abstract space called the embedding space, where faces of the same identity lie closer together than faces of different identities. To succeed, models must filter out image properties that do not identify the person. Yet we still lack a clear view of which properties, and particularly which interpretable attributes, organise that space. We frame the question at two levels: the macroscale, across identities, and the microscale, inside identities. Without insight at either scale, hidden geometry can entangle demographic traits or lighting quirks, undermining fairness and robustness. We introduce a distance-based analysis for the macroscale (sketched below) and an invariance energy measure for the microscale to quantify how strongly each attribute shapes the embedding space. Our investigation uncovers consistent behaviour across recent FR models. Fine-tuning with targeted augmentations on an attribute further increases the corresponding invariance energy, confirming that our measures accurately track FR invariance to interpretable cues. Taken together, the two scales provide a concise geometric fingerprint that lets practitioners audit and compare face recognition systems. By exposing hidden biases and guiding attribute-specific training, our method advances face biometrics, and possibly other metric-learning tasks, toward greater transparency, fairness, and resilience.
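To make the macroscale framing concrete, here is a hedged sketch of what a distance-based, across-identity check could look like: compare average distances between identity centroids that share a binary attribute against pairs that differ in it. The helper name `macroscale_attribute_gap`, the centroid representation, and the toy data are assumptions for illustration; the paper's actual analysis is given in the full text.

```python
import numpy as np

def macroscale_attribute_gap(centroids: np.ndarray, attrs: np.ndarray) -> float:
    """Difference between the mean centroid distance of identity pairs
    that DIFFER in a binary attribute and of pairs that SHARE it. A
    large positive gap suggests the attribute organises the macroscale
    (across-identity) geometry. Illustrative sketch only."""
    # Pairwise Euclidean distances between all identity centroids.
    dists = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    same = attrs[:, None] == attrs[None, :]
    iu = np.triu_indices(len(centroids), k=1)    # each unordered pair once
    d, s = dists[iu], same[iu]
    return d[~s].mean() - d[s].mean()

# Toy example: centroids of 100 identities; the attribute pushes one axis.
rng = np.random.default_rng(1)
attrs = rng.integers(0, 2, size=100)
centroids = rng.normal(size=(100, 128))
centroids[:, 0] += 2.0 * attrs
print(f"gap: {macroscale_attribute_gap(centroids, attrs):.3f}")
```

A gap near zero would indicate that the attribute does not separate identities at the macroscale, the kind of geometric fingerprint the lay summary describes.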
Link To Code: https://github.com/mantonios107/attrs-fr-embs
Primary Area: Applications->Computer Vision
Keywords: Face Recognition, Representation Learning, Interpretability, Embeddings, Contrastive Loss
Submission Number: 12598