Topic-XICL: Demonstration Selection with Topic Inference for Cross-lingual In-context Learning

Anonymous

16 Feb 2024 · ACL ARR 2024 February Blind Submission · Readers: Everyone
Abstract: Cross-lingual in-context learning (XICL) shows promise for adapting large language models (LLMs) to low-resource languages. Previous methods rely on off-the-shelf retrievers or on task-specific retrievers trained with feedback signals from an LLM for demonstration selection; the former often overlook important factors beyond semantic similarity, while the latter can be resource-costly. To address these challenges, we propose a novel approach called Topic-XICL, which leverages a latent topic model to select demonstrations across languages. We assume that latent topic variables capture information beyond semantics, such as syntax and task structure. By training this topic model on high-resource language data with a small LLM, we obtain more informative demonstrations through topic inference and use them for in-context learning across various LLMs. Our method is evaluated on three multilingual tasks (XNLI, XCOPA, and TyDiQA-GoldP) using three differently sized BLOOMZ models and three models with approximately 7 billion parameters (BLOOM, XGLM, and Llama 2). Comparative evaluations against random-selection, semantic-similarity, and clustering-based selection baselines show consistent improvements in average multilingual performance with our approach.
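The core selection step described in the abstract — ranking candidate demonstrations by how close their inferred topic distributions are to that of the test input — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the cosine ranking criterion, and the toy topic distributions below are all assumptions; the actual topic distributions would come from the latent topic model trained with a small LLM.

```python
def cosine(u, v):
    """Cosine similarity between two topic distributions."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def select_demonstrations(query_topics, pool, k=3):
    """Pick the k demonstrations whose inferred topic distributions
    best match the query's (hypothetical ranking criterion).

    query_topics: topic distribution inferred for the test input.
    pool: list of (demonstration_text, topic_distribution) pairs.
    """
    ranked = sorted(pool, key=lambda d: cosine(query_topics, d[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy usage with made-up 3-topic distributions:
pool = [
    ("demo A", [0.7, 0.2, 0.1]),
    ("demo B", [0.1, 0.8, 0.1]),
    ("demo C", [0.6, 0.3, 0.1]),
]
print(select_demonstrations([0.65, 0.25, 0.10], pool, k=2))
# → ['demo A', 'demo C']
```

The selected demonstrations would then be concatenated into the prompt for the target LLM, which is how the topic model trained once on high-resource data can serve multiple downstream LLMs.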
Paper Type: long
Research Area: Multilinguality and Language Diversity
Contribution Types: Approaches to low-resource settings
Languages Studied: English
