Reusable Low-Rank Subspaces Explain Why Cross-Modal Transfer Adapts with Tiny Updates

09 May 2026 (modified: 09 May 2026) · ICML 2026 Workshop CoLoRAI Submission · CC BY 4.0
Keywords: low-rank adaptation, transfer learning, multi-modal, time series
TL;DR: Pretrained LLMs already contain low-rank, forecasting-compatible subspaces. Cross-modal time-series transfer works because LoRA selects and refines these reusable directions instead of building them from scratch.
Abstract: Parameter-efficient finetuning methods such as LoRA routinely adapt massive pretrained transformers to new tasks using only tiny low-rank updates, but the representational geometry that makes this possible remains unclear. We use cross-modal transfer from a language-pretrained transformer to time-series forecasting as a controlled probe of low-rank adaptation, asking why so few directions suffice. Across adaptation regimes, LoRA recovers most of the transfer benefit of full finetuning; effective-rank analyses show that pretrained representations already concentrate on a low-rank subspace that finetuning \emph{redistributes} rather than rebuilds; and a single linear projection aligns frozen hidden states with realistic time-series trajectories without paired supervision. Randomly initialized models, by contrast, must first construct a compressed representation through a uniform layer-wise collapse before they can specialize. These results support a view of cross-modal adaptation as low-rank \emph{direction selection} within reusable pretrained subspaces.
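For a concrete handle on the two quantities the abstract leans on, here is a minimal NumPy sketch. It is not code from the paper: the exponentiated-entropy definition of effective rank (Roy & Vetterli, 2007), the dimensions, and the rank-16 toy data are all assumptions chosen for illustration. It shows (a) how effective rank detects a representation concentrated on few directions, and (b) why a LoRA-style update can only select rank-r directions of a frozen weight.

```python
import numpy as np

def effective_rank(H: np.ndarray) -> float:
    """Effective rank (Roy & Vetterli, 2007): exponentiated Shannon
    entropy of the normalized singular-value spectrum. Near the
    ambient dimension for isotropic features; small when the
    representation concentrates on a few directions."""
    s = np.linalg.svd(H, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # drop zero singular values before taking logs
    return float(np.exp(-np.sum(p * np.log(p))))

# Hypothetical hidden states dominated by a 16-dim subspace of a
# 768-dim residual stream (a stand-in for a pretrained layer).
rng = np.random.default_rng(0)
basis = rng.standard_normal((16, 768))
H = rng.standard_normal((512, 16)) @ basis \
    + 0.01 * rng.standard_normal((512, 768))
print(f"effective rank ~ {effective_rank(H):.1f}  (ambient dim 768)")

# A LoRA-style update touches at most r directions of the frozen
# weight: W_eff = W + B @ A with A: (r, d) and B: (d, r), so the
# update has rank <= r -- direction selection rather than rebuilding W.
d, r = 768, 8
W = rng.standard_normal((d, d))    # frozen pretrained weight
A = 0.01 * rng.standard_normal((r, d))
B = np.zeros((d, r))               # standard zero init: W_eff starts at W
W_eff = W + B @ A
```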
Submission Number: 142