Keywords: In-context Learning, Large language models, Cross-domain retrieval, Demonstration selection, Chain of Embedding
TL;DR: We use the PCA hull volume of the Chain-of-Embedding as a label-free signal to decide when demonstration diversity helps, boosting cross-domain ICL.
Abstract: In-context learning (ICL) is an emergent capability of large language models (LLMs) that solves unseen tasks by conditioning on a few demonstrations without updating parameters. ICL performance hinges on how the sampler selects demonstrations. In deployment, distribution shift is common and target labels are scarce due to privacy and cost, making cross-domain retrieval both important and challenging. Yet sampler behavior under such shifts remains underexplored. Our analysis shows that the relevance–performance and diversity–performance relationships vary by domain, and that characterizing this trade-off in the target domain is essential for sampler generalization in cross-domain settings. However, without labels this trade-off cannot be evaluated directly, so we leverage the geometry of LLM embedding trajectories (the Chain of Embedding, CoE), defined as the sequence of hidden states across layers, as a label-free signal. We show that broader exploration of the trajectory, measured by PCA hull volume, flags domains where diversity correlates positively with performance. Building on this insight, we propose LRPG (Latent Reasoning Path Guidance), a lightweight method that decides whether to increase diversity based on CoE statistics, without requiring labels. Across diverse benchmarks, existing samplers suffer large drops in cross-domain settings, whereas LRPG consistently improves target-domain ICL performance and composes orthogonally with existing sampler designs.
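The label-free signal described above can be sketched numerically: collect the per-layer hidden states of a query (the Chain of Embedding), project them onto their leading principal components, and measure the volume (here, 2-D area) of the convex hull they trace. This is an illustrative reconstruction, not the paper's exact implementation; the function name `coe_hull_area` and the choice of two principal components are assumptions.

```python
import numpy as np

def coe_hull_area(hidden_states: np.ndarray) -> float:
    """Area of the 2-D PCA convex hull of a layer-wise embedding trajectory.

    hidden_states: (num_layers, hidden_dim) array, one row per layer
    (the Chain of Embedding). A larger area indicates broader exploration
    of the trajectory. Hypothetical sketch; the paper may use a different
    projection dimension or volume estimator.
    """
    # PCA via SVD on the mean-centered trajectory; rows of Vt are
    # principal directions.
    X = hidden_states - hidden_states.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = X @ Vt[:2].T  # project onto the first two principal components

    # Andrew's monotone-chain convex hull on the 2-D projection.
    pts = sorted(map(tuple, P))

    def half_hull(points):
        h = []
        for p in points:
            while len(h) >= 2:
                ax, ay = h[-1][0] - h[-2][0], h[-1][1] - h[-2][1]
                bx, by = p[0] - h[-2][0], p[1] - h[-2][1]
                if ax * by - ay * bx > 0:  # left turn: keep
                    break
                h.pop()
            h.append(p)
        return h

    hull = half_hull(pts)[:-1] + half_hull(pts[::-1])[:-1]
    if len(hull) < 3:
        return 0.0  # degenerate (collinear) trajectory

    # Shoelace formula for the polygon area.
    x, y = np.asarray(hull).T
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
```

A sampler could then compare this statistic against a per-domain threshold to decide whether to favor diverse demonstrations; the thresholding rule itself is not specified in the abstract.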
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 5705