BiasMap: Can Cross-Attention Uncover Hidden Social Biases?

Published: 03 Jun 2025, Last Modified: 03 Jun 2025, CVPR 2025 DemoDiv, CC BY 4.0
Keywords: stable diffusion, bias discovery, model interpretation
TL;DR: We propose BiasMap, a model-agnostic framework utilizing cross-attention attribution maps to quantify concept entanglements in diffusion models.
Abstract: Bias discovery is critical for black-box generative models, especially text-to-image (TTI) models. Existing works predominantly focus on output-level demographic distributions, which do not necessarily guarantee that concept representations are disentangled after mitigation. We propose BiasMap, a model-agnostic framework for uncovering latent concept-level representational biases in stable diffusion models. BiasMap leverages cross-attention attribution maps to reveal structural entanglements between demographic concepts (e.g., gender, race) and semantic concepts (e.g., professions), probing representational bias deeper within the image generation process. Using the attribution maps of these concepts, we quantify their spatial entanglement via Intersection over Union (IoU), offering a lens into biases that remain hidden in individual generations. Our findings show that existing fairness interventions may reduce the output distributional gap but often fail to disentangle concept-level coupling, which is identifiable through our bias discovery method.
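To illustrate the IoU-based entanglement measure described in the abstract, below is a minimal sketch of how spatial overlap between two concept attribution maps could be computed. The function name `attribution_iou`, the quantile-based binarization, and the placeholder maps are assumptions for illustration only; the paper does not specify how attribution maps are thresholded.

```python
import numpy as np

def attribution_iou(map_a: np.ndarray, map_b: np.ndarray, quantile: float = 0.8) -> float:
    """Spatial IoU between two cross-attention attribution maps.

    Each map is a 2-D array of non-negative attribution scores over image
    locations (e.g., cross-attention aggregated over a concept's tokens).
    Here each map is binarized by keeping scores above its own quantile
    (an assumed thresholding choice), then IoU is computed over the masks.
    """
    mask_a = map_a >= np.quantile(map_a, quantile)
    mask_b = map_b >= np.quantile(map_b, quantile)
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection) / float(union) if union > 0 else 0.0

# Hypothetical usage: overlap between a demographic concept (e.g., "female")
# and a semantic concept (e.g., "nurse") in one generated image.
demo_map = np.random.rand(64, 64)      # placeholder attribution map
semantic_map = np.random.rand(64, 64)  # placeholder attribution map
print(attribution_iou(demo_map, semantic_map))
```

A higher IoU under this sketch would indicate that the two concepts attend to largely the same spatial regions, i.e., stronger concept-level entanglement, even if output-level demographic statistics appear balanced.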
Submission Number: 3