CORE: Discovering Intrinsic Ranking Preferences in LLMs via Consistent Ego-Correction

16 Sept 2025 (modified: 12 Feb 2026) · ICLR 2026 Conference Desk Rejected Submission · CC BY 4.0
Keywords: Listwise Reranking, Large Language Models, Information Retrieval, Prompt Sensitivity, Consistency Regularization, Robustness
TL;DR: To fix LLMs' unreliable rankings, our CORE method fine-tunes them to ignore superficial prompt variations and consistently follow their own core understanding of relevance.
Abstract: Large language models (LLMs) are powerful listwise rerankers, but their performance is notoriously sensitive to prompt variations, undermining their reliability in real-world applications. This paper introduces CORE (Consistent Reranking via Ego-correction), a new fine-tuning framework that mitigates this instability by learning a model's intrinsic, prompt-invariant ranking preferences. CORE integrates two complementary mechanisms: a guidance strategy adapted from Classifier-Free Guidance, which calibrates the generative process against stylistic variations, and a consistency loss based on a differentiable Kendall's tau, which regularizes the model's internal ordinal judgments. On the standard TREC Deep Learning and BEIR benchmarks, CORE establishes new state-of-the-art ranking performance. Crucially, it demonstrates superior robustness, reducing performance variance across diverse prompts by over 80% compared to standard fine-tuning. Our work presents a principled and effective method for building powerful and trustworthy LLM-based reranking systems.
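
To make the abstract's two mechanisms concrete, here is a minimal sketch in PyTorch; it is not the paper's released code. It shows a Classifier-Free-Guidance-style combination of relevance scores from a styled and a style-neutral prompt, and a tanh-relaxed, differentiable Kendall's tau used as a consistency loss between rankings produced by two prompt variants. The function names, the tanh relaxation, `temperature`, `guidance_weight`, and the direction of the guidance adaptation are all illustrative assumptions.

```python
# Minimal sketch, NOT the authors' implementation: one plausible reading of a
# CFG-style score combination and a differentiable Kendall's tau consistency
# loss. All names, the tanh relaxation, `temperature`, and `guidance_weight`
# are assumptions made for illustration.
import torch


def guided_scores(styled: torch.Tensor,
                  neutral: torch.Tensor,
                  guidance_weight: float = 1.5) -> torch.Tensor:
    """CFG-style combination: start from the style-neutral scores and move
    along the (styled - neutral) direction, scaled by a guidance weight.
    How CORE orients this adaptation is not specified in the abstract."""
    return neutral + guidance_weight * (styled - neutral)


def soft_kendall_tau(scores_a: torch.Tensor,
                     scores_b: torch.Tensor,
                     temperature: float = 0.5) -> torch.Tensor:
    """Differentiable Kendall's tau between two score vectors of shape (n,).

    The sign of each pairwise difference is replaced by a tanh relaxation,
    so gradients flow through both rankings; the result lies in (-1, 1).
    """
    n = scores_a.numel()
    diff_a = scores_a.unsqueeze(1) - scores_a.unsqueeze(0)  # a_i - a_j
    diff_b = scores_b.unsqueeze(1) - scores_b.unsqueeze(0)  # b_i - b_j
    # Soft concordance per pair: close to 1 when both vectors order the
    # pair the same way, close to -1 when they disagree.
    concord = torch.tanh(diff_a / temperature) * torch.tanh(diff_b / temperature)
    iu = torch.triu_indices(n, n, offset=1)  # the n*(n-1)/2 distinct pairs
    return concord[iu[0], iu[1]].mean()


def consistency_loss(scores_a: torch.Tensor,
                     scores_b: torch.Tensor) -> torch.Tensor:
    """Penalize ordinal disagreement between two prompt variants."""
    return 1.0 - soft_kendall_tau(scores_a, scores_b)


# Example: relevance scores for the same five passages under two prompt
# phrasings; the loss is near zero when the induced orderings agree.
s_a = torch.tensor([2.1, 0.3, 1.7, -0.4, 0.9], requires_grad=True)
s_b = torch.tensor([1.8, 0.1, 1.9, -0.2, 0.5])
consistency_loss(s_a, s_b).backward()
```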
Primary Area: generative models
Submission Number: 7316