Abstract: We present a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD). The teacher relies on a highly effective but expensive two-stage process consisting of query translation and monolingual IR, while the student executes a single CLIR step. We teach the student powerful multilingual encoding as well as CLIR by optimizing two corresponding KD objectives. Learning useful non-English representations from an English-only retriever is accomplished through a cross-lingual token alignment algorithm that relies on the representation capabilities of the underlying multilingual language model. In both in-domain and zero-shot evaluation, the proposed method substantially outperforms direct fine-tuning with labeled CLIR data. One of our systems is also the current best single-model system on the XOR-TyDi leaderboard.
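The abstract describes combining two KD objectives: one that distills the teacher's retrieval scores into the student, and one that aligns the student's token representations with the teacher's. As a minimal sketch of how such a combined loss might look (the function names, the KL/MSE choices, and the weighting parameter `alpha` are illustrative assumptions, not the paper's actual formulation):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def kd_retrieval_loss(teacher_scores, student_scores):
    # KL(teacher || student) between passage-relevance distributions;
    # distills the teacher's ranking behavior into the student
    p = softmax(np.asarray(teacher_scores, dtype=float))
    q = softmax(np.asarray(student_scores, dtype=float))
    return float(np.sum(p * (np.log(p) - np.log(q))))

def kd_alignment_loss(teacher_vecs, student_vecs):
    # MSE between teacher and student representations for token pairs
    # matched by some cross-lingual alignment (alignment itself not shown)
    t = np.asarray(teacher_vecs, dtype=float)
    s = np.asarray(student_vecs, dtype=float)
    return float(np.mean((t - s) ** 2))

def multi_stage_kd_loss(teacher_scores, student_scores,
                        teacher_vecs, student_vecs, alpha=0.5):
    # hypothetical weighted combination of the two KD objectives
    return (alpha * kd_retrieval_loss(teacher_scores, student_scores)
            + (1.0 - alpha) * kd_alignment_loss(teacher_vecs, student_vecs))
```

In practice both terms would be computed over mini-batches with a trainable student encoder; the sketch only illustrates how the two distillation signals could be combined into a single training objective.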
Paper Type: short