Morphology of Chinese Characters: Evaluating LLMs and VLMs on Visual Features and Radical Prompting for NLP Tasks

ACL ARR 2024 June Submission3129 Authors

15 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: As a glyphic language, Chinese incorporates information-rich visual features below the character level, such as radicals, which can provide hints about meaning or pronunciation. However, we argue that Large Language Models (LLMs) and Vision-Language Models (VLMs) fail to identify or harness these valuable features. Our study evaluates how well LLMs and VLMs identify visual information in Chinese characters, including radicals, composition structures, strokes, and stroke counts. Additionally, we design "radical prompting" to explore whether radical information can improve LLM performance on NLP tasks. Results demonstrate that most LLMs and VLMs struggle to recognize any visual information in Chinese characters. Introducing "radical prompting" led to some improvements in LLM performance across NLP tasks, but significant improvement was seen only when correct radicals were provided, as observed in the part-of-speech (POS) tagging task.
Paper Type: Long
Research Area: Phonology, Morphology and Word Segmentation
Research Area Keywords: Morphology
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: Chinese
Submission Number: 3129
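
The abstract does not give the prompt template used for "radical prompting", so the following is only an illustrative sketch of the general idea: attaching radical hints to a task prompt before it is sent to an LLM. The lookup table `RADICALS`, the functions `radical_hints` and `build_radical_prompt`, and the prompt wording are all hypothetical and are not taken from the paper.

```python
# Hypothetical, tiny lookup table (assumption for illustration only).
# A real experiment would need a complete character-to-radical dictionary.
RADICALS = {
    "河": "氵",  # river: water radical
    "树": "木",  # tree: wood radical
    "吃": "口",  # eat: mouth radical
}


def radical_hints(text, radicals=RADICALS):
    """Return (character, radical) pairs for known characters, in order, without repeats."""
    seen = set()
    hints = []
    for ch in text:
        if ch in radicals and ch not in seen:
            seen.add(ch)
            hints.append((ch, radicals[ch]))
    return hints


def build_radical_prompt(sentence, task="part-of-speech tagging", radicals=RADICALS):
    """Build a plain-text prompt that lists radical hints after the input sentence."""
    lines = [f"Task: {task}", f"Sentence: {sentence}"]
    hints = radical_hints(sentence, radicals)
    if hints:
        lines.append("Radical hints:")
        lines.extend(f"- {ch}: radical {rad}" for ch, rad in hints)
    lines.append("Answer:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_radical_prompt("我吃树"))
```

Because the abstract reports significant gains only when correct radicals were provided, a sketch like this takes the radicals from an external table rather than asking the model to produce them.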