How Do Llamas Process Multilingual Text? A Latent Exploration through Patchscopes

ACL ARR 2024 June Submission 2192 Authors

15 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: A central question in multilingual language modeling is whether large language models (LLMs) develop a universal concept representation that is disentangled from specific languages. In this paper, we address this question by analyzing Llama-2's forward pass during a word translation task. We strategically extract latents from a source translation prompt and insert them into the forward pass on a target translation prompt. In doing so, we find that the output language is encoded in the latent at an earlier layer than the concept to be translated. Building on this insight, we show that both the target concept in the source language and the source concept in the target language can be obtained via patching alone. Furthermore, we show that patching in the mean of multiple source-language latents does not impair our ability to decode the source concept in the target language, indicating that concept representations are language-agnostic.
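The core mechanic the abstract describes, extracting a latent from the source prompt's forward pass and overwriting the corresponding activation in the target prompt's forward pass, can be sketched with PyTorch forward hooks on a Hugging Face Llama-2 model. This is a minimal sketch, not the authors' exact setup: the checkpoint, layer index, patch position, prompt templates, and the helper names `get_latent` / `generate_with_patch` are all illustrative assumptions.

```python
# Minimal cross-prompt activation-patching sketch (Patchscopes-style).
# Assumptions: HF transformers Llama-2 checkpoint; layer 15 and last-token
# position are illustrative, not the paper's configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def get_latent(prompt: str, layer: int, pos: int = -1) -> torch.Tensor:
    """Capture the output of decoder layer `layer` at token position `pos`."""
    store = {}
    def hook(module, args, output):
        hidden = output[0] if isinstance(output, tuple) else output
        store["latent"] = hidden[0, pos].detach().clone()
    handle = model.model.layers[layer].register_forward_hook(hook)
    try:
        with torch.no_grad():
            model(**tok(prompt, return_tensors="pt"))
    finally:
        handle.remove()
    return store["latent"]

def generate_with_patch(prompt: str, latent: torch.Tensor, layer: int,
                        pos: int = -1, max_new_tokens: int = 5) -> str:
    """Generate from `prompt` with `latent` overwriting layer `layer` at `pos`."""
    def hook(module, args, output):
        hidden = output[0] if isinstance(output, tuple) else output
        if hidden.shape[1] > 1:  # patch only the full prompt pass, not cached steps
            hidden[0, pos] = latent.to(hidden.dtype)
    handle = model.model.layers[layer].register_forward_hook(hook)
    try:
        inputs = tok(prompt, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    finally:
        handle.remove()
    return tok.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# Illustrative use: take the last-token latent of a German->English source
# prompt and patch it into a French->English target prompt at the same layer.
source_prompt = 'Deutsch: "Buch" - English:'
target_prompt = 'Français: "fleur" - English:'
latent = get_latent(source_prompt, layer=15)
print(generate_with_patch(target_prompt, latent, layer=15))
```

Per the abstract's finding, one would expect the patched output to depend on the layer chosen: patching at earlier layers should carry over the output language, while patching at later layers should carry over the concept itself.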
Paper Type: Short
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: patchscope, multilingual, concept representation, concept space, feature space, representations
Contribution Types: Model analysis & interpretability, Reproduction study, Data analysis
Languages Studied: EN, DE, NL, ZH, ES, RU, FR, FI, KO.
Submission Number: 2192