Exploring the Representation Manifolds of Stable Diffusion Through the Lens of Intrinsic Dimension

Published: 04 Mar 2023, Last Modified: 16 May 2023 | ME-FoMo 2023 Poster | Readers: Everyone
Keywords: Geometry, Hidden representations, Text-to-image models
TL;DR: We try to understand the connection between prompts and the geometry of internal representations in diffusion models.
Abstract: Prompting has become an important mechanism by which users can interact more effectively with many flavors of foundation model. Indeed, the last several years have shown that well-honed prompts can sometimes unlock emergent capabilities within such models. While there has been a substantial amount of empirical exploration of prompting within the community, relatively few works have studied prompting at a mathematical level. In this work we aim to take a first step towards understanding basic geometric properties induced by prompts in Stable Diffusion, focusing on the intrinsic dimension of internal representations within the model. We find that the choice of prompt has a substantial impact on the intrinsic dimension of representations at both of the layers we explored, but that the nature of this impact depends on the layer being considered. For example, in certain bottleneck layers of the model, the intrinsic dimension of representations is correlated with prompt perplexity (measured using a surrogate model), while this correlation is not apparent in the latent layers. Our evidence suggests that intrinsic dimension could be a useful tool for future studies of the impact of different prompts on text-to-image models.
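
The abstract does not say which intrinsic dimension estimator was used. As an illustration only, the sketch below implements the maximum-likelihood form of the TwoNN estimator, a common choice that is swapped in here as an assumption and is not necessarily the authors' method. The function name `twonn_intrinsic_dimension` is hypothetical, and the input is assumed to be a matrix with one flattened hidden representation per row.

```python
import numpy as np


def twonn_intrinsic_dimension(points: np.ndarray) -> float:
    """Estimate intrinsic dimension with the maximum-likelihood TwoNN formula.

    `points` has shape (n_samples, n_features); each row is assumed to be
    one flattened representation (hypothetical setup, not from the paper).
    """
    x = np.asarray(points, dtype=np.float64)

    # Pairwise squared Euclidean distances via the Gram matrix.
    sq_norms = np.sum(x * x, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0.0, out=d2)      # clip small negative rounding errors
    np.fill_diagonal(d2, np.inf)     # a point is not its own neighbor

    # Distances to the first and second nearest neighbors of every point.
    nearest_two = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])
    r1, r2 = nearest_two[:, 0], nearest_two[:, 1]

    # Drop exact duplicates, for which the ratio r2 / r1 is undefined.
    keep = r1 > 0
    mu = r2[keep] / r1[keep]

    # Under the TwoNN model, mu is Pareto distributed with shape d,
    # so the maximum-likelihood estimate is d = N / sum(log mu).
    return float(np.count_nonzero(keep) / np.sum(np.log(mu)))
```

In a study like the one described, such a function could be applied separately to the representations collected at each layer for each prompt, and the resulting estimates compared against prompt perplexity. The pairwise distance matrix grows quadratically with the number of samples, so this sketch suits only modest sample sizes.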