DAIQ: Auditing Demographic Attribute Inference from Neutral Question in LLMs

DAIQ: Auditing Demographic Attribute Inference from Neutral Question in LLMs

ACL ARR 2026 January Submission7200 Authors

06 Jan 2026 (modified: 20 Mar 2026)ACL ARR 2026 January SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Evaluation, Value Alignment, Interpretability, Robustness

Abstract: Recent evaluations of Large language models (LLMs) audit social bias primarily through prompts that explicitly reference demographic attributes, overlooking whether models infer sensitive demographics from neutral questions. Such inference constitutes epistemic overreach and raises concerns for privacy. We introduce Demographic Attribute Inference from Questions (DAIQ), a diagnostic audit framework for evaluating demographic inference under epistemic uncertainty. We evaluate 18 open- and closed-source LLMs across six real-world domains and five demographic attributes. We find that many models infer demographics from neutral questions, defaulting to socially dominant categories and producing stereotype-aligned rationales. These behaviors persist across model families, scales and decoding settings, indicating reliance on learned population priors. We further show that inferred demographics can condition downstream responses and that abstention oriented prompting substantially reduces unintended inference without model fine-tuning. Our results suggest that current bias evaluations are incomplete and motivate evaluation standards that assess not only how models respond to demographic information, but whether they should infer it at all.

Paper Type: Long

Research Area: Ethics, Bias, and Fairness

Research Area Keywords: Evaluation, Privacy, Value Alignment, Interpretability, Robustness, Fairness

Contribution Types: Model analysis & interpretability

Languages Studied: English

Submission Number: 7200

Loading