Abstract: Phrases are fundamental linguistic units through which humans convey semantics. This study critically examines the capacity of API-based large language models (LLMs) to comprehend phrase semantics, utilizing three human-annotated datasets.
We assess the performance of LLMs in executing phrase semantic reasoning tasks guided by natural language instructions and explore the impact of common prompting techniques, including few-shot demonstrations and Chain-of-Thought reasoning.
Our findings reveal that LLMs greatly outperform traditional embedding methods across the datasets but show no significant advantage over fine-tuned methods, and the effectiveness of advanced prompting strategies varies. We conduct detailed error analyses to interpret the limitations that LLMs face in comprehending phrase semantics.
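For illustration, the sketch below shows the kind of few-shot, Chain-of-Thought prompting of an API-based LLM that the abstract refers to, applied to a phrase paraphrase judgment. The model name, prompt wording, and demonstration pairs are assumptions for the example, not the paper's actual setup or datasets.

```python
# Minimal sketch (not the paper's exact prompts or data): query an API-based
# LLM with a few-shot, Chain-of-Thought prompt for a phrase paraphrase judgment.
# Model name, prompt format, and example pairs are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

FEW_SHOT = """Decide whether the two phrases have the same meaning.
Phrase 1: "hot dog"
Phrase 2: "overheated canine"
Reasoning: "hot dog" is an idiomatic food term, not a literal description of a dog, so the meanings differ.
Answer: No

Phrase 1: "start over"
Phrase 2: "begin again"
Reasoning: both phrases describe restarting an activity from the beginning.
Answer: Yes
"""

def judge_phrase_pair(phrase1: str, phrase2: str) -> str:
    """Ask the model for a step-by-step judgment on one phrase pair."""
    prompt = (
        FEW_SHOT
        + f'\nPhrase 1: "{phrase1}"\nPhrase 2: "{phrase2}"\nReasoning:'
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any API-based chat LLM would do
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(judge_phrase_pair("give up", "throw in the towel"))
```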
Paper Type: Short
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: phrase semantics; phrase alignment; phrase embedding
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 4062