Understanding Mistakes in Transformers through Token-level Semantic Dependencies

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: transformer mistakes, token-level semantic dependencies
Abstract: Despite its strong performance, the transformer model sometimes produces incorrect information. To understand the cause of this issue, we explore how semantic dependencies are learned within the model. Specifically, we investigate how tokens in multi-head self-attention transformer models encode semantically dependent information. Intuitively, to identify the semantic information encoded within a token, our method analyzes how the token's value shifts in response to changes in semantics. We analyze BERT, LLaMA, and GPT models and observe several interesting, shared behaviors in how they encode semantically dependent information: 1) most tokens primarily retain their original semantic information, even as they pass through multiple layers; 2) a token in the final layer usually encodes truthful semantic dependencies; 3) the semantic dependency within a token is sensitive to both irrelevant context changes and the order of contexts; and 4) mistakes made by the model can be attributed to tokens that falsely encode semantic dependencies. By pinpointing the mechanisms behind semantic encoding, our findings can potentially help develop more robust and accurate transformer models.
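To illustrate the general intuition described in the abstract (this is our own minimal sketch, not the authors' method; the model choice, example sentences, and cosine-similarity metric are assumptions), one can compare a token's per-layer hidden states before and after a single semantically relevant context word is replaced:

```python
# Minimal sketch (assumption, not the paper's code): probe how a token's
# hidden state shifts when one semantically relevant context word changes.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def hidden_states(sentence):
    """Return (num_layers + 1, seq_len, dim) hidden states and the encoding."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return torch.stack(outputs.hidden_states).squeeze(1), inputs

# Two contexts differing in one semantically relevant word ("doctor" vs. "lawyer").
orig = "The doctor examined the patient in the clinic."
pert = "The lawyer examined the patient in the clinic."

h_orig, enc_orig = hidden_states(orig)
h_pert, _ = hidden_states(pert)

# Track the token "patient" (same position in both sentences here).
patient_id = tokenizer.convert_tokens_to_ids("patient")
idx = (enc_orig["input_ids"][0] == patient_id).nonzero()[0].item()

# A large drop in similarity at some layer suggests the token has absorbed
# the changed context word into its representation at that layer.
for layer in range(h_orig.shape[0]):
    cos = torch.nn.functional.cosine_similarity(
        h_orig[layer, idx], h_pert[layer, idx], dim=0
    )
    print(f"layer {layer:2d}: cosine similarity for 'patient' = {cos:.4f}")
```

In this sketch, layer-wise similarities that stay high would be consistent with observation 1) (tokens largely retain their original semantics), while sharper shifts in later layers would indicate where context-dependent information is absorbed.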
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8699