Track: Short paper
Keywords: Valence, Threatening, Supportive, Neutral, Emotion, Framing, Performance
Abstract: Large Language Models (LLMs) are influenced not only by the structure and wording of prompts but also by their emotional tone. This paper investigates how prompt valence—neutral, supportive, or threatening tone—shapes LLM performance across several measures of output quality. We propose a dual-pipeline framework for controlled prompt generation and evaluation, ensuring factual equivalence while systematically varying tone. Responses were graded using a structured rubric, validated on a pilot set, that captures accuracy, relevance, coherence, depth, linguistic quality, instruction sensitivity, and creativity. Results show that neutral prompts generally maximize reliability, supportive prompts introduce moderate variability, and threatening prompts produce the greatest variability. These effects differ across models, indicating that valence interacts with model-specific sensitivities. Overall, the findings suggest that emotional framing acts as a hidden controllability axis in LLM behavior, with implications for robust and safe deployment.
Submission Number: 51