You Shouldn't Have Asked: A Pragmatics-Inspired Taxonomy for Evaluating LLM Refusals

ACL ARR 2026 May Submission 16201 Authors

26 May 2026 (modified: 02 Jun 2026), ACL ARR 2026 May Submission, CC BY 4.0

Keywords: Pragmatics, Facework, Safety Alignment, LLM Refusal, Human-AI Interaction

Abstract: Refusals are often treated as face-threatening acts in pragmatics because they can challenge the requester’s socially claimed self-image. Large language models (LLMs) are increasingly trained to refuse unsafe and inappropriate requests, and these refusals may harm users when models fail to manage this interactional cost properly. While existing work has mainly approached LLM non-compliance as a safety-alignment outcome, it does not provide a way to evaluate whether LLMs refuse appropriately across different harmful contexts. To study this question, we propose, to our knowledge, the first taxonomy grounded in pragmatic theories of refusal for analyzing LLM non-compliance. Applying this taxonomy to responses from 16 modern LLMs across 14 harm categories, we find that although models differ in how they refuse, their refusals are overall explicit, ethics-based, and strongly morally evaluative, with interactional repair occurring mainly through offering or providing safer alternatives instead of interpersonal facework. This pattern is especially consequential in sensitive harm contexts, where overuse of negative framing may make users feel shamed or provoked, undermining the purpose of safe non-compliance. We therefore call for alignment evaluation that considers not only whether models refuse harmful requests, but also whether they refuse in ways that are contextually adaptive and socially accountable for the interactional consequences of saying no.

Paper Type: Long

Research Area: Human-Centered NLP and Human-AI Interaction

Research Area Keywords: human-centered evaluation, human factors in NLP, human-AI interaction/cooperation

Contribution Types: Model analysis & interpretability, Data resources, Data analysis, Theory

Languages Studied: English

EMNLP 2026 AI Reviewing Experiment: no

Submission Number: 16201
