Track: Main Papers Track (6 to 9 pages)
Keywords: fairness, vision transformers
Abstract: This paper presents **FairNVT**, a lightweight debiasing framework for pretrained transformer-based encoders that improves both representation-level and prediction-level fairness while preserving task accuracy. Unlike many existing debiasing approaches that address these notions in isolation, we argue they are inherently connected: representations that strongly encode sensitive information make prediction-level fairness fragile, while suppressing sensitive information at the representation level can facilitate fairer and more robust predictions.
Our approach learns task-relevant and sensitive subspaces via lightweight adapters, applies calibrated Gaussian noise to the sensitive subspace in a randomized-smoothing style, and fuses the smoothed sensitive representation with the task representation; together with orthogonality constraints and demographic-parity regularization, these components jointly reduce sensitive-attribute leakage in the learned embeddings and encourage fairer downstream predictions. The framework is compatible with a wide range of pretrained transformer encoders. Across three datasets spanning vision and language, FairNVT reduces sensitive-attribute attacker accuracy, improves demographic-parity and equal-opportunity metrics, and maintains high task performance.
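The components listed above can be sketched as follows. This is an illustrative PyTorch sketch, not the authors' implementation: all class and function names (`FairAdapter`, `orthogonality_loss`, `demographic_parity_gap`) and the specific projection/fusion layout are assumptions about how such adapters, the noise injection, and the regularizers might be wired together.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FairAdapter(nn.Module):
    """Hypothetical sketch: two lightweight linear adapters split a frozen
    encoder's embedding into a task-relevant subspace and a sensitive
    subspace; calibrated Gaussian noise smooths the sensitive part before
    the two are fused back to the encoder dimension."""
    def __init__(self, dim, sub_dim, noise_sigma=0.5):
        super().__init__()
        self.task_proj = nn.Linear(dim, sub_dim)   # task-relevant subspace
        self.sens_proj = nn.Linear(dim, sub_dim)   # sensitive subspace
        self.fuse = nn.Linear(2 * sub_dim, dim)    # fuse back to encoder dim
        self.noise_sigma = noise_sigma

    def forward(self, h):
        t = self.task_proj(h)
        s = self.sens_proj(h)
        # randomized-smoothing-style calibrated Gaussian noise on the
        # sensitive subspace only
        s_noisy = s + self.noise_sigma * torch.randn_like(s)
        fused = self.fuse(torch.cat([t, s_noisy], dim=-1))
        return fused, t, s

def orthogonality_loss(t, s):
    # Penalize overlap between task and sensitive projections
    # (squared cosine similarity per sample).
    t_n = F.normalize(t, dim=-1)
    s_n = F.normalize(s, dim=-1)
    return (t_n * s_n).sum(dim=-1).pow(2).mean()

def demographic_parity_gap(logits, group):
    # Soft demographic-parity regularizer for a binary classifier:
    # |E[p(yhat=1) | g=0] - E[p(yhat=1) | g=1]|.
    p = torch.sigmoid(logits).squeeze(-1)
    return (p[group == 0].mean() - p[group == 1].mean()).abs()
```

In a sketch like this, the adapter and regularizers would be trained on top of a frozen pretrained encoder, with the orthogonality and demographic-parity terms added to the task loss.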
Submission Number: 35