ArgBench: Benchmarking LLMs on Computational Argumentation

ACL ARR 2026 January Submission9002 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: computational argumentation, argument mining, LLMs
Abstract: Argumentation skills are essential for Large Language Models (LLMs), underpinning use cases such as self-reflection, collaborative debating to obtain diverse answers, and countering hate speech. In this paper, we present the first benchmark for the standardized evaluation of LLM-based approaches to computational argumentation, unifying 33 datasets from previous work. Using the benchmark, we evaluate five LLM families on 46 computational argumentation tasks covering argument mining, perspective and quality assessment, reasoning about arguments, and argument generation. We further conduct an extensive analysis of how few-shot examples, reasoning steps, and model size affect LLM performance on these tasks.
Paper Type: Long
Research Area: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Research Area Keywords: computational argumentation, argument mining, LLMs
Contribution Types: NLP engineering experiment, Data resources, Surveys
Languages Studied: English
Submission Number: 9002