Language System: A Lightweight Ranking Framework for Language Models

Published: 11 Jun 2025, Last Modified: 10 Jul 2025 · ES-FoMo III · CC BY 4.0
Keywords: language model, recommender system, inference-time computing
TL;DR: We rethink LLMs from the perspective of recommender systems and propose Language System, which exploits the model’s output distribution more efficiently and effectively.
Abstract: Conventional research on large language models (LLMs) has primarily focused on refining output distributions, with less attention to the decoding process that transforms these distributions into final responses. While recent work on inference-time scaling with reward models highlights the importance of decoding, such methods often incur high computational costs and suffer from limited applicability. In this paper, we revisit LLM decoding through the lens of recommender systems, conceptualizing the decoding process as analogous to the ranking stage in recommendation pipelines. From this perspective, both traditional decoding methods and reward models exhibit clear limitations, including redundant computation. To address this, we propose Language System, a lightweight framework that reranks candidate responses using features extracted by the base model. Experiments across diverse tasks demonstrate that Language System achieves performance comparable to large-scale reward models while adding fewer than 0.5M parameters, significantly reducing overhead during both training and inference. This highlights the efficiency and effectiveness of our approach in unlocking LLM capabilities.
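To make the reranking idea concrete, here is a minimal sketch of scoring candidate responses with a tiny linear head over features produced by the base model. The feature choices, function names, and the linear scoring head are illustrative assumptions for exposition, not the paper's actual architecture or training procedure.

```python
import numpy as np

def rerank(candidates, features, weights, bias=0.0):
    """Rank candidate responses with a lightweight linear head.

    candidates: list of N response strings
    features:   (N, d) array of per-candidate features from the base model
                (hypothetical examples: mean token log-prob, length, entropy)
    weights:    (d,) weights of the tiny ranking head
    Returns the candidates sorted best-first, plus their scores.
    """
    scores = np.asarray(features) @ np.asarray(weights) + bias
    order = np.argsort(-scores)  # descending by score
    return [candidates[i] for i in order], scores

# Toy usage with made-up features and weights
cands = ["answer A", "answer B", "answer C"]
feats = np.array([[0.2, 1.0],
                  [0.9, 0.8],
                  [0.5, 0.1]])
ranked, scores = rerank(cands, feats, weights=[1.0, 0.5])
print(ranked[0])  # prints "answer B" (highest linear score)
```

Because the head consumes features the base model already computes during generation, the added inference cost is a single small matrix product per batch of candidates, which is consistent with the sub-0.5M-parameter overhead the abstract claims.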
Submission Number: 29