Enhancing Multilingual Causal Commonsense Reasoning in LLMs: A Novel Assessment Approach and Strategy
Abstract: Commonsense reasoning is crucial for connecting premises to hypotheses by leveraging implicit world knowledge. The XCOPA dataset, spanning 11 languages, serves as a benchmark for evaluating cross-lingual transfer in commonsense reasoning and underscores the importance of tapping into implicit knowledge for effective communication across diverse linguistic contexts. Recent Large Language Models (LLMs) such as Llama2 have made remarkable progress on causal commonsense reasoning, setting new benchmarks. However, multilingual LLMs such as XGLM and PolyLM lag behind because their training datasets are smaller than those of English-centric LLMs. This work introduces a novel evaluation strategy, G-Evaluation, on the XCOPA dataset. Although this strategy lowered accuracy across most models, Llama2 improved, highlighting its adaptability. Even so, models evaluated on multilingual XCOPA still fall short of their English-language accuracy, and models such as Llama2 vary considerably in performance across languages, underscoring the need to bridge this gap with Machine Translation (MT). To address this, we propose XTools, a strategy that combines Machine Translation with Automatic Post-Editing tools. With XTools, multilingual accuracy rises to 89.6%, matching English performance. Our contributions are: redefining the evaluation method with G-Evaluation, introducing XTools to enhance multilingual capabilities, validating the integration of Automatic Post-Editing tools, and demonstrating the potential of lightweight models to improve overall performance.
Paper Type: long
Research Area: Resources and Evaluation
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English, Italian, Indonesian, Chinese, Vietnamese, Thai, Tamil, Haitian Creole, Estonian, German, Russian