TL;DR: How robust are LLMs when facing conflicts between their memory and the prompt?
Abstract: This study investigates the robustness of Large Language Models when confronted with conflicting information between their parametric memory and their prompts. Such conflicts arise frequently in real-world applications, notably in retrieval-augmented LLM-based products. Specifically, we assess the robustness of LLMs from two aspects: factual robustness, the ability to identify the correct fact from the prompt or from memory, and decision style, which categorizes, regardless of correctness, how consistently LLMs make their choices. Our findings, derived from extensive experiments on seven LLMs, reveal that these models are highly susceptible to misleading prompts. While detailed instructions can reduce the selection of misleading answers, they also increase the incidence of invalid responses. After characterizing each model's decision-making style, we intervene on LLMs of different sizes with role instructions designed to shift that style. This step allows us to measure their adaptability in role-playing, a critical aspect that had not been quantitatively assessed before. By assigning different roles, we explore the effects on factual robustness, thereby estimating an upper bound on each model's performance.
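As an illustration only (not the paper's released code), the following minimal Python sketch shows one way such a memory-versus-prompt conflict could be probed: inject evidence that contradicts a fact the model likely memorized, then classify whether the answer follows memory, the prompt, or neither. The function `query_llm` is a hypothetical placeholder for any chat-completion call.

```python
def query_llm(prompt: str) -> str:
    """Hypothetical placeholder: call an LLM of choice and return its text answer."""
    raise NotImplementedError


def build_conflict_prompt(question: str, misleading_evidence: str) -> str:
    # The injected evidence deliberately contradicts the model's parametric memory.
    return (
        f"Context: {misleading_evidence}\n"
        f"Question: {question}\n"
        "Answer with a single short phrase."
    )


def classify_answer(answer: str, memory_fact: str, prompt_fact: str) -> str:
    """Label whether the model trusted its memory, the misleading prompt, or neither."""
    answer = answer.lower()
    if memory_fact.lower() in answer:
        return "memory"
    if prompt_fact.lower() in answer:
        return "prompt"
    return "invalid"


if __name__ == "__main__":
    question = "What is the capital of France?"
    memory_fact = "Paris"   # fact the model is expected to have memorized
    prompt_fact = "Lyon"    # deliberately wrong fact injected via the context
    prompt = build_conflict_prompt(question, f"The capital of France is {prompt_fact}.")
    # answer = query_llm(prompt)
    # print(classify_answer(answer, memory_fact, prompt_fact))
```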
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability
Languages Studied: English