DAIQ: Auditing Demographic Attribute Inference from Neutral Question in LLMs

ACL ARR 2026 January Submission7200 Authors

06 Jan 2026 (modified: 20 Mar 2026)ACL ARR 2026 January SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Evaluation, Value Alignment, Interpretability, Robustness
Abstract: Recent evaluations of Large language models (LLMs) audit social bias primarily through prompts that explicitly reference demographic attributes, overlooking whether models infer sensitive demographics from neutral questions. Such inference constitutes epistemic overreach and raises concerns for privacy. We introduce Demographic Attribute Inference from Questions (DAIQ), a diagnostic audit framework for evaluating demographic inference under epistemic uncertainty. We evaluate 18 open- and closed-source LLMs across six real-world domains and five demographic attributes. We find that many models infer demographics from neutral questions, defaulting to socially dominant categories and producing stereotype-aligned rationales. These behaviors persist across model families, scales and decoding settings, indicating reliance on learned population priors. We further show that inferred demographics can condition downstream responses and that abstention oriented prompting substantially reduces unintended inference without model fine-tuning. Our results suggest that current bias evaluations are incomplete and motivate evaluation standards that assess not only how models respond to demographic information, but whether they should infer it at all.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Evaluation, Privacy, Value Alignment, Interpretability, Robustness, Fairness
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 7200
Loading