Does FLUX Know What It’s Writing?

Published: 30 Sept 2025, Last Modified: 30 Sept 2025. Mech Interp Workshop (NeurIPS 2025) Poster. License: CC BY 4.0
Keywords: Probing, Diffusion models
TL;DR: We investigate whether FLUX.1 has learned abstract representations of the letters it is generating.
Abstract: Text-to-image models have historically struggled to generate text within images (e.g., a slogan on a t-shirt), but recent state-of-the-art models such as FLUX.1 show significant improvements in legible text generation. Does this mean that FLUX has learned abstract representations of the letters it is generating? We investigate the implicit representations of inpainting diffusion models by printing characters onto an evenly spaced grid and prompting the model to fill in masked characters. Probing the latent representations of these character grids across the model's components, we find evidence of generalizable letter representations in the middle transformer layers, suggesting a notion of letter identity that is consistent across fonts.
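To make the probing setup concrete, here is a minimal sketch of a linear probe of the kind the abstract describes. All names and the synthetic data are hypothetical stand-ins: in the actual experiment, the features would be activations cached from a middle transformer layer of FLUX.1 at the masked character's position, and generalization across fonts would be tested by training on some fonts and evaluating on held-out ones.

```python
# Hypothetical sketch of a letter-identity linear probe (not the paper's code).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

n_samples, d_model, n_letters = 200, 64, 26

def fake_activations(n, seed):
    # Stand-in for activations extracted from the diffusion model while it
    # inpaints a masked character on the grid; a letter-dependent signal is
    # injected so the probe has something to find.
    r = np.random.default_rng(seed)
    y = r.integers(0, n_letters, size=n)          # letter identity labels
    X = r.normal(size=(n, d_model))               # "layer activations"
    X[np.arange(n), y % d_model] += 3.0
    return X, y

X_train, y_train = fake_activations(n_samples, seed=1)  # "training fonts"
X_test, y_test = fake_activations(n_samples, seed=2)    # "held-out font"

# A linear probe: if a simple classifier decodes letter identity from the
# activations of unseen fonts, the representation is linearly accessible
# and font-invariant.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print("held-out probe accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```

A probe that stays accurate on fonts never seen during probe training is what would support the abstract's claim of a letter-identity representation consistent across fonts.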
Submission Number: 93