Looking Into the Black Box - How Are Idioms Processed in BERT?

Anonymous

16 Jan 2022 (modified: 05 May 2023) · ACL ARR 2022 January Blind Submission · Readers: Everyone
Abstract: Idioms such as "call it a day" and "piece of cake" are ubiquitous in natural language. How are idioms processed by language models such as BERT? This study investigates this question with three experiments: (1) an analysis of the embedding similarity between idiomatic sentences and their literal, spelled-out counterparts; (2) an analysis of word embeddings when a word appears in an idiomatic versus a literal context; and (3) an attention analysis of words when they appear in an idiomatic versus a literal context. Each of these experiments analyses results across all layers of BERT. Experiment 1 shows that the cosine similarity between the embeddings of an idiomatic sentence and its spelled-out counterpart increases with layer depth; however, when compared to random controls, the spelled-out counterpart is ranked highest in embedding similarity at layer 8. Experiment 2 shows that the embeddings of single words in idiomatic versus literal contexts diverge, with the difference also peaking at layer 8. Experiment 3 shows that other tokens in a sentence pay less attention to a word inside an idiom than to the same word in a literal sentence. Overall, the study suggests that BERT "understands" idiomatic expressions, and that it processes them more like a syntactic phenomenon than a purely semantic one. Attention is one mechanism underlying this understanding, reflecting that idioms are both semantically and syntactically idiosyncratic.
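To make the kind of measurements described above concrete, the sketch below shows how per-layer sentence similarity (Experiment 1) and per-layer attention received by a word inside an idiom (Experiment 3) could be computed with the Hugging Face transformers library. This is a minimal illustration, not the authors' code: the model name `bert-base-uncased`, the example sentence pair, and mean pooling over tokens are all assumptions made for the example.

```python
# Minimal sketch (assumed setup, not the paper's released code):
# per-layer cosine similarity between an idiomatic sentence and a
# spelled-out paraphrase, and per-layer attention received by a
# target token inside the idiom.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained(
    "bert-base-uncased", output_hidden_states=True, output_attentions=True
)
model.eval()

def encode(sentence):
    """Return tokenizer inputs, per-layer hidden states, and attentions."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # hidden_states: tuple of 13 tensors (embedding layer + 12 layers), each (1, seq, 768)
    # attentions: tuple of 12 tensors, each (1, heads, seq, seq)
    return inputs, outputs.hidden_states, outputs.attentions

idiom = "Finishing the report was a piece of cake."      # idiomatic sentence (example)
literal = "Finishing the report was a very easy task."   # spelled-out paraphrase (example)

idiom_inputs, idiom_states, idiom_attn = encode(idiom)
_, literal_states, _ = encode(literal)

# Experiment 1 style measure: cosine similarity of mean-pooled
# sentence embeddings at every layer.
for layer in range(1, len(idiom_states)):
    a = idiom_states[layer].mean(dim=1)    # (1, 768)
    b = literal_states[layer].mean(dim=1)  # (1, 768)
    sim = torch.cosine_similarity(a, b).item()
    print(f"layer {layer:2d}: cosine similarity = {sim:.3f}")

# Experiment 3 style measure: total attention that the other tokens pay
# to the token "cake" inside the idiom, per layer (summed over heads).
tokens = tokenizer.convert_ids_to_tokens(idiom_inputs["input_ids"][0].tolist())
target = tokens.index("cake")
for layer, attn in enumerate(idiom_attn, start=1):
    received = attn[0, :, :, target].sum().item()  # sum over heads and source positions
    print(f"layer {layer:2d}: attention received by 'cake' = {received:.3f}")
```

In practice, the aggregation choices (mean pooling versus the [CLS] token, summing versus averaging attention over heads) are analysis decisions; the sketch simply illustrates how layer-wise quantities of this kind can be read out of BERT.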
Paper Type: long