Track: long paper (up to 8 pages)
Keywords: Sparse Autoencoders, Unsupervised Learning, Image Reconstruction, Interpretable Models, Computational Efficiency
TL;DR: We introduce Tensor-SAE, a structured sparse autoencoder that decodes through a learned bank of rank-1 tensor atoms (color × height × width).
Abstract: We introduce Tensor-SAE, a structured sparse autoencoder that decodes through a
learned bank of rank-1 tensor atoms (color × height × width). By factorizing the
decoder into separable color and spatial factors and applying a light sparsity prior
on latent activations, Tensor-SAE induces compact, interpretable representations
that enable linear, spatially localized, and semantically meaningful interventions in
image reconstructions. Unlike unconstrained dense or convolutional decoders that
distribute information diffusely, Tensor-SAE enforces a strong inductive bias that
trades some raw pixel-level fidelity for computational efficiency, interpretability,
and controllability. We evaluate Tensor-SAE on CIFAR-10 against two baselines
(a parameter-matched Dense-SAE and a ConvAE scaled to the same parameter budget). Our empirical suite (six figures) demonstrates that Tensor-SAE: (1) learns
low-entropy spatial atoms and clean color factors; (2) yields linearly predictable
intervention effects (R² ≈ 0.93) enabling controllable color edits; (3) achieves
superior reconstruction efficiency per FLOP and per parameter; (4) produces consistently sparse latents; and (5) stabilizes intervention strength during training. We
discuss trade-offs, limitations, and the application of Tensor-SAE as a building
block for interpretable, compute-efficient generative systems.
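The rank-1 decoder described above can be made concrete with a short sketch. This is a minimal illustration, not the authors' implementation: the function and variable names (`decode`, `color`, `height`, `width`, the atom count `K`) are assumptions. Each atom k factorizes into a color vector (length 3), a height vector (length H), and a width vector (length W), and the reconstruction is the activation-weighted sum of the outer products:

```python
import numpy as np

def decode(z, color, height, width):
    """Rank-1 tensor-atom decoder (illustrative sketch).

    z      : (K,)   latent activations (sparse, e.g. ReLU outputs)
    color  : (K, 3) per-atom color factors
    height : (K, H) per-atom vertical spatial factors
    width  : (K, W) per-atom horizontal spatial factors

    Returns a (3, H, W) image equal to
        sum_k z[k] * color[k] (outer) height[k] (outer) width[k].
    """
    return np.einsum("k,kc,kh,kw->chw", z, color, height, width)

rng = np.random.default_rng(0)
K, H, W = 16, 32, 32
z = np.maximum(rng.normal(size=K), 0.0)  # non-negative, roughly sparse code
color = rng.normal(size=(K, 3))
height = rng.normal(size=(K, H))
width = rng.normal(size=(K, W))
img = decode(z, color, height, width)
print(img.shape)  # (3, 32, 32)
```

Because each atom is separable, scaling a single activation `z[k]` rescales one color × spatial pattern linearly, which is the property behind the linearly predictable interventions the abstract reports.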
Anonymization: This submission has been anonymized for double-blind review by removing identifying information such as names, affiliations, and URLs.
Submission Number: 102