Dissecting Chronos: Sparse Autoencoders Reveal Causal Feature Hierarchies in Time Series Foundation Models

Anurag Mishra

Published: 01 Mar 2026, Last Modified: 10 Apr 2026, Venue: ICLR 2026 TSALM Workshop Poster, Readers: Everyone, License: CC BY 4.0

Presentation Attendance: No, we cannot present in-person

Keywords: sparse autoencoders, time series foundation models, mechanistic interpretability, Chronos-T5, causal feature ablation

TL;DR: First SAE study on a time series foundation model: features in Chronos-T5's mid-encoder are causally critical change detectors, while the semantically richest final encoder layer paradoxically improves under ablation.

Abstract: Time series foundation models (TSFMs) are increasingly deployed in high-stakes domains, yet their internal representations remain opaque. We present the first application of sparse autoencoders (SAEs) to a TSFM, training TopK SAEs on activations of Chronos-T5-Large (710M parameters) across six layers. Through 392 single-feature ablation experiments, we establish that every ablated feature degrades the continuous ranked probability score (CRPS), i.e. produces a positive $\Delta\mathrm{CRPS}$, confirming causal relevance. Our analysis reveals a depth-dependent hierarchy: early encoder layers encode low-level frequency features, the mid-encoder concentrates causally critical change-detection features, and the final encoder compresses a rich but less causally important taxonomy of temporal concepts. The most critical features reside in the mid-encoder (max single-feature $\Delta\mathrm{CRPS} = 38.61$), not in the semantically richest final encoder layer, where progressive ablation paradoxically improves forecast quality. These findings demonstrate that mechanistic interpretability transfers effectively to TSFMs and that Chronos-T5 relies on abrupt-dynamics detection rather than periodic pattern recognition.
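
The abstract names three ingredients: a TopK SAE, single-feature ablation, and a CRPS-based effect size. The NumPy sketch below illustrates each on random toy data. It is not the authors' code. The weight shapes, the ReLU applied to the kept activations, zero-ablation of the latent, and the sample-based CRPS estimator are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np


def topk_sae_encode(x, W_enc, b_enc, k):
    """Sparse latents: keep the k largest pre-activations per row, zero the rest.

    Assumption: a ReLU is applied to the kept values (one common TopK variant).
    """
    pre = x @ W_enc + b_enc                           # (batch, n_features)
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]    # top-k indices per row
    rows = np.arange(pre.shape[0])[:, None]
    z = np.zeros_like(pre)
    z[rows, idx] = np.maximum(pre[rows, idx], 0.0)
    return z


def sae_decode(z, W_dec, b_dec):
    """Linear reconstruction of the activation from sparse latents."""
    return z @ W_dec + b_dec


def ablate_feature(z, j):
    """Return a copy of the latents with feature j set to zero (assumed zero-ablation)."""
    z_ablated = z.copy()
    z_ablated[:, j] = 0.0
    return z_ablated


def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    s = np.asarray(samples, dtype=float)
    spread = np.abs(s[:, None] - s[None, :]).mean()
    return np.abs(s - y).mean() - 0.5 * spread


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_model, n_features, k = 16, 64, 4                # toy sizes, not the paper's
    x = rng.normal(size=(8, d_model))                 # stand-in for layer activations
    W_enc = rng.normal(size=(d_model, n_features))
    W_dec = rng.normal(size=(n_features, d_model))
    b_enc, b_dec = np.zeros(n_features), np.zeros(d_model)

    z = topk_sae_encode(x, W_enc, b_enc, k)
    j = int(np.argmax(z.sum(axis=0)))                 # most active feature in this batch
    x_hat = sae_decode(z, W_dec, b_dec)
    x_hat_ablated = sae_decode(ablate_feature(z, j), W_dec, b_dec)
    print("feature", j, "reconstruction shift:", np.linalg.norm(x_hat - x_hat_ablated))

    # Toy effect size: CRPS of an "ablated" forecast minus CRPS of a baseline forecast.
    y_true = 1.0
    baseline = rng.normal(loc=1.0, scale=0.5, size=200)
    ablated = rng.normal(loc=2.0, scale=0.5, size=200)
    print("delta CRPS:", crps_from_samples(ablated, y_true) - crps_from_samples(baseline, y_true))
```

In a real experiment the ablated reconstruction would be patched back into the model at the hooked layer and the forecast samples would come from the model itself. Here the two forecast distributions are synthetic, so the printed difference only shows how a positive $\Delta\mathrm{CRPS}$ would be computed.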

Track: Research Track (max 4 pages)

Submission Number: 108
