Abstract: Detecting logical fallacies in texts could improve online discussion quality by helping users spot argument flaws and construct better arguments. However, automatically identifying logical fallacies in the wild is not easy: fallacies are often buried inside arguments that sound convincing, and over 100 types of logical fallacies exist. Building the large labeled datasets needed to develop automatic fallacy detection models can be expensive. This paper introduces CoCoLoFa, the largest logical fallacy dataset to date, containing 5,772 comments on 647 news articles, with each comment labeled for fallacy presence and type. To collect the data, we first specified a fallacy type (e.g., slippery slope) and a news article to crowd workers, then asked them to write comments embodying that fallacy in response to the article. We built an LLM-powered assistant into the interface to help workers draft and refine their comments. Experts rated the writing quality and labeling validity of CoCoLoFa as high and reliable. Models trained on CoCoLoFa achieved the highest fallacy detection performance (F1 = 0.65) on real-world news comments from the New York Times, surpassing models trained on other datasets and even GPT-4.
Paper Type: long
Research Area: Resources and Evaluation
Contribution Types: Data resources
Languages Studied: English