Persistent Topological Features in Large Language Models

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We introduce zigzag persistence for the interpretation of internal representations of large language models
Abstract: Understanding the decision-making processes of large language models is critical given their widespread applications. To achieve this, we aim to connect a formal mathematical framework, zigzag persistence from topological data analysis, with practical and easily applicable algorithms. Zigzag persistence is particularly effective for characterizing data as it dynamically transforms across model layers. Within this framework, we introduce topological descriptors that measure how topological features, $p$-dimensional holes, persist and evolve throughout the layers. Unlike methods that assess each layer individually and then aggregate the results, our approach directly tracks the full evolutionary path of these features. This offers a statistical perspective on how prompts are rearranged and how their relative positions change in the representation space, providing insights into the system's operation as an integrated whole. To demonstrate the expressivity and applicability of our framework, we highlight how sensitive these descriptors are to different models and a variety of datasets. As a showcase application to a downstream task, we use zigzag persistence to establish a criterion for layer pruning, achieving results comparable to state-of-the-art methods while preserving the system-level perspective.
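To make the pipeline described in the abstract concrete, the sketch below shows one way zigzag persistence could be computed over per-layer representations: build a Vietoris-Rips complex on the prompt embeddings at each layer, record in which layers each simplex appears, and feed the resulting presence intervals to a zigzag persistence routine. This is a minimal illustration, not the authors' implementation: the Dionysus 2 library, the fixed-radius Rips construction, the `layer_zigzag` function, and the interval encoding of layer presence are all assumptions made for the sake of the example.

```python
import numpy as np
import dionysus as d  # Dionysus 2 provides zigzag_homology_persistence


def layer_zigzag(layer_embeddings, radius=1.0, max_dim=2):
    """Zigzag persistence across layers (illustrative sketch, not the paper's code).

    layer_embeddings: list of (n_prompts, embed_dim) arrays, one per model layer.
    radius, max_dim: Vietoris-Rips threshold and top simplex dimension (assumed).
    """
    # Simplices present at each layer, stored as frozensets of vertex indices.
    per_layer = []
    for X in layer_embeddings:
        rips = d.fill_rips(np.asarray(X, dtype=np.float64), max_dim, radius)
        per_layer.append({frozenset(sx) for sx in rips})

    # For every simplex that ever appears, record the layers containing it.
    presence = {}
    for i, simplices in enumerate(per_layer):
        for s in simplices:
            presence.setdefault(s, []).append(i)

    # One global filtration holding each simplex once, lower dimensions first.
    all_simplices = sorted(presence, key=lambda s: (len(s), sorted(s)))
    f = d.Filtration()
    for s in all_simplices:
        f.append(d.Simplex(sorted(s)))

    # Alternating (appear, disappear) times per simplex: a maximal run of
    # consecutive layers [a, b] is encoded as the interval [a, b + 1).
    # This is a simplified stand-in for the zigzag through unions of
    # consecutive layer complexes.
    times = []
    for s in all_simplices:
        layers = presence[s]
        t, run_start, prev = [], layers[0], layers[0]
        for i in layers[1:] + [None]:
            if i is None or i != prev + 1:
                t.extend([run_start, prev + 1])
                run_start = i
            prev = i if i is not None else prev
        times.append(t)

    # Zigzag persistence diagrams, one per homology dimension.
    _, dgms, _ = d.zigzag_homology_persistence(f, times)
    return dgms
```

From the returned diagrams one could then read off, per homology dimension, how long each $p$-dimensional hole survives across layers; the paper's actual topological descriptors and pruning criterion are defined in the full text.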
Lay Summary: Large language models (LLMs), like those behind popular AI chatbots, can generate impressively human-like responses to text prompts, but how they actually process information internally remains mostly unknown. Given the widespread use of these models in increasingly important tasks, this lack of transparency raises serious concerns in the scientific community and beyond. Researchers also want to make these large models smaller and less resource-intensive without losing their effectiveness. To tackle both issues, our work brings in tools from mathematics, specifically "topological data analysis," which is good at describing complex shapes and relationships in data. We apply a mathematical approach called zigzag persistence to track how information evolves across each layer of an LLM, instead of looking at each layer separately. This lets us measure how groups of data points change and interact through the whole model. With this method, we are able to spot different "phases" in how the model processes language inputs, and we can use our findings to suggest which model layers could be removed (pruned) to compress the model without major performance loss. Our approach works on different models and datasets, offering a new window into how LLMs actually work and paving the way to safer, more efficient AI systems.
Primary Area: Deep Learning->Large Language Models
Keywords: Topological Data Analysis, Persistent Homology, Large Language Models, Internal Representations, Layer Pruning
Submission Number: 10988