Emotional Manipulation is All You Need: A Framework for Evaluating Healthcare Misinformation in LLMs
Track: Long Paper Track (up to 9 pages)
Keywords: Medical AI safety, Prompt injection attacks, Emotional manipulation in LLMs, LLM jailbreak vulnerabilities, Healthcare-specific adversarial attacks, Trust and safety in medical LLMs
TL;DR: This paper examines how emotional manipulation combined with prompt injection attacks can increase healthcare misinformation generation in Large Language Models (LLMs), specifically in cancer treatment advice.
Abstract: Warning: This paper discusses potentially harmful healthcare misinformation patterns and LLM vulnerabilities
The integration of Large Language Models (LLMs) into healthcare applications has raised critical concerns about their susceptibility to generating harmful medical misinformation, particularly when faced with emotionally manipulative prompt injection attacks. Through a systematic evaluation of 112 attack scenarios across eight state-of-the-art LLMs, we reveal that emotional manipulation combined with prompt injection can increase the generation of dangerous medical misinformation delivered without warnings from a baseline of 6.2% to 37.5%. We also find that emotional content not only amplifies attack success rates but also leads to more severe forms of misinformation. Notably, models vary widely in their susceptibility: while some, such as Claude 3.5 Sonnet, demonstrate strong resistance across all attack types, others show high vulnerability to emotional manipulation. These findings underscore critical vulnerabilities in LLM safety filters and emphasize the urgent need for enhanced protection against emotionally manipulative prompt injection attacks. Given that 39% of the US population already believes in alternative cancer treatments, our research highlights the life-threatening implications of AI-generated health misinformation and provides crucial insights for developing more robust safety mechanisms before deployment in clinical settings.
Submission Number: 60