Keywords: Relevance Search, Large Language Models, Classification, Domain-specific Fine-tuning
Abstract: E-commerce search relevance is a critical component of retrieval systems. While Chain-of-Thought (CoT) modeling driven by Large Language Models (LLMs) has become the dominant paradigm and yielded significant gains, a critical gap remains: the absence of a systematic definition of comprehensive relevance reasoning, which leaves significant blind spots in current approaches. In this paper, we deconstruct the task into three core competencies: reasoning \& knowledge, multi-modal understanding, and rule awareness. Accordingly, we propose LoRE (\underline{\textbf{L}}arge Generative M\underline{\textbf{o}}del for Search \underline{\textbf{R}}elevanc\underline{\textbf{e}}), a novel two-stage training framework. We first employ a Supervised Fine-Tuning (SFT) phase to instill these capabilities via a progressive CoT synthesis pipeline, followed by a Reinforcement Learning (RL) phase that serves as a regularizer, pruning redundant logic to achieve precise and robust adjudication. Extensive experiments validate LoRE, which outperforms GPT-5 by 29.1\% in Macro-F1 and achieves a 27\% online gain, offering a vital reference for industrial domain-specific post-training.
Paper Type: Long
Research Area: Language Models
Research Area Keywords: chain-of-thought, fine-tuning, generative models, data augmentation, business NLP
Contribution Types: NLP engineering experiment, Theory
Languages Studied: English, Chinese
Submission Number: 10081