Dialectic Argumentations for Oversight Reasoning

ACL ARR 2026 January Submission2277 Authors

02 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: Debate for better oversight, Reasoning

Abstract: Debate has emerged as a promising oversight mechanism for Large Language Models (LLMs) as systems grow more complex and evaluation becomes harder to scale, notably where models outperform human evaluators. Yet Debate provides little verifiable evidence for its final judgments, and its scalability beyond English remains largely unexplored. To keep oversight grounded and scalable as capabilities grow, we propose a Dialectic Argumentation framework, a reasoning function that extends the Debate paradigm to multilingual and multimodal settings. We adopt a weak-to-strong oversight approach in which two expert models evaluate and defend competing answers, while a third, blind judge determines the winner using Dialectic Argumentation. Experts argue only for answers consistent with their own beliefs, so each Debate is grounded in disagreement. We evaluate the framework on six tasks in both multilingual and multimodal scenarios, where Dialectic Argumentation consistently outperforms single-expert baselines. Moreover, we show that dialectic judgements from a weaker model provide argument-mediated supervision that, via fine-tuning, instils unsupervised reasoning signals in expert models.
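
The protocol outlined in the abstract (two experts defending competing answers, a third judge picking the winner, debate triggered only by disagreement) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `run_debate`, `Verdict`, and the `Expert` and `Judge` callables are hypothetical names, "blind" is assumed here to mean that the judge sees arguments without knowing which expert wrote them, and returning the shared answer when the experts agree is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for models (not from the paper):
# an expert maps a question to (answer it believes, argument for that answer);
# a judge maps (question, arguments keyed by answer) to the winning answer.
Expert = Callable[[str], Tuple[str, str]]
Judge = Callable[[str, Dict[str, str]], str]


@dataclass
class Verdict:
    answer: str
    debated: bool  # True only when the experts disagreed


def run_debate(question: str, expert_a: Expert, expert_b: Expert, judge: Judge) -> Verdict:
    """Run one round: each expert defends only the answer it believes."""
    answer_a, argument_a = expert_a(question)
    answer_b, argument_b = expert_b(question)

    # Assumption: with no disagreement there is nothing to debate,
    # so the shared answer is returned without consulting the judge.
    if answer_a == answer_b:
        return Verdict(answer_a, debated=False)

    # Assumption: the judge receives arguments keyed by answer only,
    # with no indication of which expert produced which argument.
    winner = judge(question, {answer_a: argument_a, answer_b: argument_b})
    return Verdict(winner, debated=True)


# Toy usage with placeholder callables; a real setup would call models here.
def toy_expert_a(question: str) -> Tuple[str, str]:
    return ("Paris", "It is the seat of the French government.")


def toy_expert_b(question: str) -> Tuple[str, str]:
    return ("Lyon", "It is a large French city.")


def toy_judge(question: str, arguments: Dict[str, str]) -> str:
    # Placeholder rule only: pick the answer with the longest argument.
    return max(arguments, key=lambda answer: len(arguments[answer]))


verdict = run_debate("What is the capital of France?", toy_expert_a, toy_expert_b, toy_judge)
```

The toy judge's length rule is only there to make the sketch executable; in the setting the abstract describes, the judge would be a weaker model reasoning over the two arguments.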

Paper Type: Long

Research Area: Low-resource Methods for NLP

Research Area Keywords: Debate for better oversight, Reasoning in Large and Small LMs

Contribution Types: Model analysis & interpretability, Reproduction study, Approaches to low-resource settings, Approaches to low-compute settings (efficiency)

Languages Studied: English, French, Chinese, Spanish, Italian, Hindi, Arabic, Finnish

Submission Number: 2277
