Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability

ICLR 2026 Conference Submission 20563 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: post-training, language models, distributional learning, alignment, pluralistic alignment, uncertainty estimation
TL;DR: We contribute a novel dataset and post-training method to improve in-context steerability, distributional alignment, and coverage, and characterize weaknesses of current post-training techniques along these desiderata.
Abstract: Language model post-training has enhanced instruction-following and performance on many downstream tasks, but also comes with an often-overlooked cost on tasks with many possible valid answers. We characterize three desiderata: in-context steerability, valid output space coverage, and distributional alignment, and document across three model families how post-training can reduce these properties. In particular, we disambiguate between two kinds of in-context learning: ICL for eliciting existing underlying knowledge or capabilities, and in-context steerability, where a model must use in-context information to override its priors and steer to a novel data-generating distribution. To better evaluate and improve these desiderata, we introduce Spectrum Suite, a large-scale resource compiled from $>40$ data sources and spanning $>90$ tasks requiring models to steer to and match diverse distributions. We find that while instruction-tuning helps elicit underlying knowledge and capabilities, it hurts a model's ability to flexibly steer in-context. To mitigate these issues, we propose Spectrum Tuning, a post-training method using Spectrum Suite to improve steerability and distributional coverage. We find that Spectrum Tuning often improves over pretrained models and their instruction-tuned counterparts, enhancing steerability, spanning more of the output space, and improving distributional alignment on held-out datasets.
Supplementary Material: pdf
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 20563