Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models

ACL ARR 2024 December Submission 222 Authors

11 Dec 2024 (modified: 22 Feb 2025), ACL ARR 2024 December Submission, CC BY 4.0

Abstract: Knowledge neuron theory, which holds that facts are stored in multi-layer perceptron (MLP) neurons, provides a key approach to understanding the mechanisms of factual knowledge in Large Language Models (LLMs). This paper further explores **Degenerate Knowledge Neurons** (DKNs): distinct sets of neurons that can store identical facts but, unlike simple redundancy, also participate in storing other, different facts. Despite its novelty and unique properties, this concept has not been rigorously defined or systematically studied. Our contributions are: (1) We pioneer the study of structure in knowledge neurons by analyzing weight connection patterns, providing a comprehensive definition of DKNs from both functional and structural perspectives. (2) Based on this definition, we develop the **Neuronal Topology Clustering** method, which identifies DKNs more accurately. (3) We demonstrate the practical relevance of DKNs in two respects: guiding LLMs to learn new knowledge, and their connection to LLMs' robustness against input errors.

Paper Type: Long

Research Area: Interpretability and Analysis of Models for NLP

Research Area Keywords: knowledge tracing/discovering/inducing

Contribution Types: Model analysis & interpretability, NLP engineering experiment

Languages Studied: English

Submission Number: 222
