Abstract: Idioms such as "call it a day" and "piece of cake" are ubiquitous in natural language. How are idioms processed by Transformer language models? This study investigates this question on three models: BERT, Multilingual BERT, and DistilBERT. We compare the embeddings of idiomatic and literal expressions across all layers of the networks, at both the sentence level and the word level. We also examine the attention that other sentence tokens direct towards a word inside an idiom, compared to the same word in a literal context. Results show that the three language models have different inner workings, but they all represent idioms differently from literal language, with attention being a crucial mechanism. The findings suggest that idioms are semantically and syntactically idiosyncratic, not only for humans but also for language models.
Paper Type: short