Policy-Based Sentence Simplification: Replacing Parallel Corpora with LLM-as-a-Judge

ICLR 2026 Conference Submission 16435 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Sentence simplification, LLM-as-a-Judge
TL;DR: We propose a method that leverages LLM-as-a-Judge to replace costly human annotation and parallel corpora, enabling effective policy-based control of LLMs for sentence simplification.
Abstract: Sentence simplification aims to modify a sentence so that it is easier to read and understand while preserving its meaning. Different applications require distinct simplification policies, such as replacing only complex words at the lexical level or rewriting the entire sentence while trading off detail for simplicity. However, achieving such policy-driven control remains an open challenge. In this work, we introduce a simple yet powerful approach that leverages Large Language Model-as-a-Judge (LLM-as-a-Judge) to automatically construct policy-aligned training data, completely removing the need for costly human annotation or parallel corpora. Our method enables building simplification systems that adapt to diverse simplification policies. Remarkably, even small-scale open-source LLMs such as Phi-3-mini-3.8B surpass GPT-4o on lexical-oriented simplification, while achieving comparable performance on overall rewriting, as verified by both automatic metrics and human evaluations. The consistent improvements across model families and sizes demonstrate the robustness of our approach.
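To illustrate the idea described in the abstract, the sketch below shows one plausible way to construct policy-aligned training pairs with an LLM judge. It is only a minimal illustration under assumed interfaces: `generate_candidates`, `judge_score`, and the example policy string are hypothetical placeholders, not the authors' released code or prompts.

```python
# Hypothetical sketch: filtering candidate simplifications with an LLM judge
# to build policy-aligned (source, simplification) training pairs.
from typing import Callable, List, Tuple

def build_policy_aligned_pairs(
    source_sentences: List[str],
    generate_candidates: Callable[[str], List[str]],   # placeholder: candidate simplifier LLM
    judge_score: Callable[[str, str, str], float],     # placeholder: judge LLM, returns score in [0, 1]
    policy: str,                                        # e.g. "replace only complex words; keep syntax intact"
    threshold: float = 0.8,
) -> List[Tuple[str, str]]:
    """Keep only pairs the judge rates as compliant with the given simplification policy."""
    pairs: List[Tuple[str, str]] = []
    for src in source_sentences:
        for cand in generate_candidates(src):
            # The judge scores the candidate against both the source and the policy.
            if judge_score(src, cand, policy) >= threshold:
                pairs.append((src, cand))
                break  # one accepted pair per source sentence is enough for fine-tuning
    return pairs
```

The accepted pairs could then be used to fine-tune a small open-source model (e.g., Phi-3-mini-3.8B) for a particular policy, replacing the parallel corpus that supervised simplification would otherwise require; the actual generation and judging prompts, scoring scale, and filtering criteria are those defined in the paper, not this sketch.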
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 16435