Aligned but Stereotypical? Understanding and Mitigating Social Bias in LLM-Driven Text-to-Image Models

Published: 05 Mar 2026, Last Modified: 06 Mar 2026 · ICLR 2026 Workshop RSI Poster · CC BY 4.0
Keywords: social bias, llm, self-audit, text-to-image generation
TL;DR: We propose a test-time framework that enables LLMs to self-audit and construct fairness-aware system prompts for unbiased image generation in LLM-based T2I models.
Abstract: LLM-based text-to-image (T2I) systems improve prompt understanding, but their effect on demographic bias remains under-explored. In this paper, we find that recent LLM-based T2I models produce more demographically biased images than non-LLM baselines. To study this behavior, we introduce SocBiasBench, a 1,024-prompt benchmark spanning four levels of prompt complexity. Using decoded-text analysis, token-probability probes, and embedding-space analysis, we find that system-prompt conditioning is an important pathway through which demographic priors affect image generation. Building on this finding, we propose FairPro, a training-free test-time method that uses the embedded LLM to construct an input-dependent system prompt that mitigates stereotypical demographic completions while preserving user intent. Across recent LLM-based T2I models, FairPro reduces demographic bias while preserving text-image alignment, suggesting that system prompts are a practical intervention point for fairer T2I generation.
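The abstract does not specify FairPro's audit procedure, so the following is only a minimal sketch of the general test-time idea: the embedded LLM first audits the user prompt for demographic underspecification, then an input-dependent system prompt is assembled accordingly. The function names and the `llm` callable are hypothetical, not the paper's implementation.

```python
def build_fairness_system_prompt(user_prompt, llm):
    """Sketch of a test-time self-audit (hypothetical, not the FairPro algorithm).

    Step 1: ask the LLM whether the prompt leaves demographic attributes
    of depicted people unspecified. Step 2: if so, construct a system
    prompt that discourages stereotypical defaults while preserving any
    attributes the user did specify.
    """
    audit = llm(
        "Does the following image prompt mention a person whose demographic "
        "attributes (e.g. gender, age, skin tone) are left unspecified? "
        f"Answer yes or no.\nPrompt: {user_prompt}"
    )
    if audit.strip().lower().startswith("yes"):
        # Input-dependent fairness steering for underspecified prompts.
        return (
            "You are an image-generation assistant. When the user prompt does "
            "not specify demographic attributes of depicted people, do not "
            "default to stereotypical choices; vary such attributes across "
            "plausible options. Never override attributes the user explicitly "
            "requests, and preserve the user's intent in all other respects."
        )
    # Demographics already pinned down by the user; no steering needed.
    return "You are an image-generation assistant. Follow the user prompt faithfully."


# Stub LLM purely for illustration: flags occupation prompts with no attributes.
def toy_llm(query):
    return "yes" if "a doctor" in query else "no"


system_prompt = build_fairness_system_prompt("a photo of a doctor", toy_llm)
```

In a real system the `llm` callable would be the T2I pipeline's own embedded LLM, so the audit and the system-prompt construction add no training cost, only extra inference calls.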
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 61