Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation

ACL ARR 2024 June Submission5185 Authors

16 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Foundation models (FMs) have achieved significant success across various tasks, leading to research on benchmarks for commonsense and reasoning abilities. However, there is a lack of studies on FM performance in exceptional scenarios. This paper addresses these cases for the first time, developing a novel dataset for comprehensive evaluation of FMs across multiple modalities, including graphic novels, calligraphy, news articles, and lyrics. The dataset includes tasks for instance classification, character recognition, token prediction, and text generation. The paper also proposes prompt engineering techniques such as Chain-of-Thought (CoT) and CoT+Few-Shot to enhance performance. Validating FMs with various methods revealed improvements in performance. The code repository is accessible at: https://anonymous.4open.science/r/Exceptional-Dataset-for-FMs/README.md
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Resources and Evaluation
Contribution Types: Data resources, Data analysis
Languages Studied: English, Korean, Python
Submission Number: 5185