Research Area: Science of LMs, LMs and the world
Keywords: Large Language Models, Latent Representations, Factual Knowledge, Activation Patching, Graphs, Knowledge Graphs
TL;DR: A framework based on the technique of activation patching to represent the factual knowledge embedded in the vector space of LLMs as dynamic knowledge graphs.
Abstract: Large Language Models (LLMs) demonstrate an impressive capacity to recall a vast range of factual knowledge.
However, understanding the reasoning and internal mechanisms by which they exploit this knowledge remains an open research problem.
This work unveils the factual information an LLM represents internally for sentence-level claim verification.
We propose an end-to-end framework to decode factual knowledge embedded in token representations from a vector space to a set of ground predicates, showing its layer-wise evolution using a dynamic knowledge graph.
Our framework employs activation patching, a vector-level technique that alters a token representation during inference, to extract encoded knowledge.
Accordingly, we rely on neither training nor external models.
Using factual and common-sense claims from two claim verification datasets, we showcase interpretability analyses at local and global levels.
The local analysis highlights entity centrality in LLM reasoning, from claim-related information and multi-hop reasoning to representation errors causing erroneous evaluation.
The global analysis, on the other hand, reveals trends in the underlying evolution, such as word-based knowledge evolving into claim-related facts.
By interpreting semantics from LLM latent representations and enabling graph-related analyses, this work enhances the understanding of the factual knowledge resolution process.
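The core operation named in the abstract, activation patching, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy layers below stand in for a real LLM, and the names `make_toy_layer` and `run` are hypothetical. The sketch only shows the general mechanism of capturing one token's hidden vector at a chosen layer in one forward pass and overwriting a token's hidden vector with it in another pass.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]
Layer = Callable[[List[Vector]], List[Vector]]


def make_toy_layer(scale: float) -> Layer:
    """Hypothetical layer: adds the sequence mean to each token, then scales."""
    def layer(hidden: List[Vector]) -> List[Vector]:
        dim = len(hidden[0])
        mean = [sum(tok[d] for tok in hidden) / len(hidden) for d in range(dim)]
        return [[scale * (tok[d] + mean[d]) for d in range(dim)] for tok in hidden]
    return layer


def run(
    layers: List[Layer],
    embeddings: List[Vector],
    patch: Optional[Tuple[int, int, Vector]] = None,
    capture: Optional[Tuple[int, int]] = None,
) -> Tuple[List[Vector], Optional[Vector]]:
    """Forward pass over the toy layers.

    patch   = (layer_idx, token_idx, vector): overwrite that token's hidden
              state with `vector` right after layer `layer_idx`.
    capture = (layer_idx, token_idx): return a copy of that token's hidden
              state right after layer `layer_idx`.
    """
    hidden = [list(tok) for tok in embeddings]
    captured = None
    for idx, layer in enumerate(layers):
        hidden = layer(hidden)
        if patch is not None and patch[0] == idx:
            hidden[patch[1]] = list(patch[2])
        if capture is not None and capture[0] == idx:
            captured = list(hidden[capture[1]])
    return hidden, captured


# Toy "model" and two toy inputs (assumed values, for illustration only).
layers = [make_toy_layer(0.5), make_toy_layer(1.0)]
source = [[1.0, 0.0], [0.0, 1.0]]
target = [[0.0, 0.0], [2.0, 2.0]]

# 1) Capture token 1 of the source input after layer 0.
_, src_vec = run(layers, source, capture=(0, 1))
# 2) Run the target input unmodified, then with token 0 patched after layer 0.
clean, _ = run(layers, target)
patched, _ = run(layers, target, patch=(0, 0, src_vec))
```

Comparing `clean` with `patched` shows how the injected representation propagates through the remaining layers. With a real model the same idea is typically realized through framework hooks on intermediate layers, and the patched pass is decoded into text rather than inspected as raw vectors.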
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 911