A quantitative analysis of semantic information in deep representations of text and images

Santiago Acevedo; Andrea Mascaretti; Riccardo Rende; Matéo Mahaut; Marco Baroni; Alessandro Laio

19 Sept 2025 (modified: 11 Feb 2026), Submitted to ICLR 2026, CC BY 4.0

Keywords: Semantic representations, Representation Similarity, DeepSeek-V3, Information Imbalance.

TL;DR: Deep networks generate “semantic” representations in specific inner layers. We show that this semantic content is spread across many tokens, displays long-range causal correlations, and exhibits model-dependent asymmetries.

Abstract: Deep neural networks are known to develop similar representations for semantically related data, even when the data belong to different domains, such as an image and its description, or the same text in different languages. We present a method for quantitatively investigating this phenomenon by measuring the relative information content of the representations of semantically related data and probing how that information is encoded across multiple tokens of large language models (LLMs) and vision transformers. Looking first at how LLMs process pairs of translated sentences, we identify inner "semantic" layers containing the most language-transferable information. We also identify layers encoding semantic information within vision transformers. We show that caption representations in the semantic layers of LLMs predict visual representations of the corresponding images. We observe significant and model-dependent information asymmetries between image and text representations.

Primary Area: interpretability and explainable AI

Submission Number: 16158
