Abstract: Multilingual semantic search is the task of retrieving content relevant to a query, where the query and the content may appear in any combination of languages. It is less explored and more challenging than its monolingual or bilingual counterparts because it requires circumventing ``language bias''. Overcoming language bias demands a stronger alignment approach that pulls the representations of the content to be retrieved close to those of the corresponding queries, regardless of the language combination. Traditionally, this alignment is achieved through additional supervision in the form of multilingual parallel resources, which are expensive to obtain. In this work, we propose MAML-Align, a novel alignment approach designed specifically for low-resource multilingual semantic search. Our approach leverages meta-distillation on top of MAML, an optimization-based Model-Agnostic Meta-Learner. MAML-Align distills knowledge from a Teacher meta-transfer model, T-MAML, specialized in transferring from monolingual to bilingual semantic search, to a Student model, S-MAML, which transfers from bilingual to multilingual semantic search. To the best of our knowledge, we are the first to extend meta-distillation to a multilingual search application. Our low-resource evaluation shows that, on top of a strong baseline based on sentence transformers, our meta-distillation approach significantly outperforms naive fine-tuning and vanilla MAML.
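To make the training scheme concrete, below is a minimal PyTorch sketch of the meta-distillation idea the abstract describes: a Teacher MAML loop adapted on monolingual episodes and a Student MAML loop adapted on bilingual episodes, with a distillation term tying the Student's embeddings to the Teacher's. Everything here (the toy Encoder, the episode layout, kd_weight, and the MSE distillation term) is an illustrative assumption, not the paper's actual implementation or losses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call


class Encoder(nn.Module):
    """Toy stand-in for a sentence-transformer encoder (hypothetical)."""
    def __init__(self, dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)


def retrieval_loss(model, params, queries, docs):
    """In-batch contrastive loss: each query should rank its own document first."""
    q = functional_call(model, params, (queries,))
    d = functional_call(model, params, (docs,))
    scores = q @ d.t()                                  # (B, B) similarity matrix
    return F.cross_entropy(scores, torch.arange(q.size(0)))


def inner_adapt(model, params, queries, docs, inner_lr=0.1, create_graph=True):
    """One MAML inner-loop step on a support set; returns adapted parameters."""
    loss = retrieval_loss(model, params, queries, docs)
    grads = torch.autograd.grad(loss, list(params.values()), create_graph=create_graph)
    return {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}


def meta_distill_step(teacher, student, batch, inner_lr=0.1, kd_weight=0.5):
    """Teacher adapts on monolingual episodes, Student on bilingual episodes;
    the Student's outer loss adds a distillation term toward the Teacher."""
    t_adapted = inner_adapt(teacher, dict(teacher.named_parameters()),
                            *batch["mono_support"], inner_lr=inner_lr,
                            create_graph=False)          # Teacher is not meta-updated
    s_adapted = inner_adapt(student, dict(student.named_parameters()),
                            *batch["bi_support"], inner_lr=inner_lr)

    q, d = batch["multi_query"]                          # multilingual query set
    task_loss = retrieval_loss(student, s_adapted, q, d)

    with torch.no_grad():                                # Teacher gives a fixed target
        t_emb = functional_call(teacher, t_adapted, (q,))
    s_emb = functional_call(student, s_adapted, (q,))
    kd_loss = F.mse_loss(s_emb, t_emb)                   # distill Teacher alignment

    return task_loss + kd_weight * kd_loss


if __name__ == "__main__":
    torch.manual_seed(0)
    dim, B = 32, 4
    teacher, student = Encoder(dim), Encoder(dim)
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)

    # Random tensors stand in for encoded (query, document) pairs per episode.
    batch = {
        "mono_support": (torch.randn(B, dim), torch.randn(B, dim)),
        "bi_support":   (torch.randn(B, dim), torch.randn(B, dim)),
        "multi_query":  (torch.randn(B, dim), torch.randn(B, dim)),
    }

    loss = meta_distill_step(teacher, student, batch)
    opt.zero_grad()
    loss.backward()                                      # meta-gradient to the Student
    opt.step()
    print(f"meta-distillation loss: {loss.item():.4f}")
```

The design choice sketched here keeps the Teacher's inner loop outside the meta-gradient (create_graph=False), so only the Student is meta-updated; whether MAML-Align freezes the Teacher this way is an assumption of this sketch.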
Paper Type: long
Research Area: Multilinguality and Language Diversity
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: Arabic, German, Greek, Hindi, Russian, Thai, Turkish, Spanish, English
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.