Auditing Counterfire: Evaluating Advanced Counterargument Generation with Evidence and Style

Anonymous

16 Feb 2024, ACL ARR 2024 February Blind Submission, Readers: Everyone

Abstract: We audited counter-arguments generated by large language models (LLMs), focusing on their ability to produce evidence-based and stylistic counter-arguments to posts from the Reddit ChangeMyView dataset. Our evaluation is based on Counterfire, a new dataset of 32,000 counter-arguments generated by GPT-3.5 Turbo and Koala, their fine-tuned variants, and PaLM 2, with varying prompts for evidence use and argumentative style. GPT-3.5 Turbo ranked highest in argument quality, with strong paraphrasing and style adherence, particularly in 'reciprocity' style arguments. However, the 'No Style' counter-arguments proved most persuasive on average. The findings suggest that a balance between evidentiality and stylistic elements is key to an effective counter-argument. We close with a discussion of future research directions and implications for fine-tuning LLMs.

Paper Type: long

Research Area: Computational Social Science and Cultural Analytics

Contribution Types: Data resources, Data analysis

Languages Studied: English
