Security Knowledge Dilution in Large Language Models: How Irrelevant Context Degrades Critical Domain Expertise

Published: 22 Sept 2025, Last Modified: 25 Nov 2025DL4C @ NeurIPS 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Large Language Models, AI Safety, Security, Context Dilution, Domain Expertise
TL;DR: This paper identifies a critical, previously understudied failure mode in LLMs called 'knowledge dilution', where specialized expertise systematically degrades during natural conversations containing relevant but off-topic technical content.
Abstract: Large Language Models (LLMs) demonstrate remarkable capabilities across diverse domains, yet their performance can be unexpectedly fragile when specialized knowledge is required. We investigate a novel phenomenon we term 'knowledge dilution', the degradation of domain-specific expertise when models are exposed to large volumes of irrelevant but contextually plausible information. Through a controlled experiment involving 400 code generation tasks across varying levels of context dilution, we demonstrate that security-focused knowledge in LLMs systematically degrades as irrelevant technical content increases in the conversation context. Our findings reveal that security feature implementation drops by 47% when moving from focused contexts (0 dilution tokens) to heavily diluted contexts (40,000 dilution tokens), with statistical significance (p < 0.001). This work has critical implications for AI safety, particularly in security-critical applications where domain expertise degradation could lead to vulnerable systems. While demonstrated here in the security domain using GPT-4, this phenomenon likely represents a fundamental challenge for maintaining specialized expertise in production LLM deployments across critical domains.
Submission Number: 71
Loading