QUERY-SYNERGY: Leveraging English for Improving Retrieval Performance Across Multiple Languages

ACL ARR 2025 July Submission1432 Authors

29 Jul 2025 (modified: 03 Sept 2025), ACL ARR 2025 July Submission, CC BY 4.0
Abstract: We propose Query-Synergy, a training-free approach to improving retrieval performance using multilingual embeddings. Retrieval systems typically rely on queries written in the same language as the documents, which may not fully exploit the rich semantic representations available for high-resource languages. Our method supplements each source-language query with an additional English query and integrates the similarity scores from both queries to improve retrieval performance. We evaluate our approach on five languages (Arabic, Chinese, Greek, Thai, and Turkish) using four multilingual embedding models on two datasets. Our experiments show that this approach outperforms conventional retrieval with the source-language query alone, achieving higher nDCG scores across various configurations and translation settings. These results show that Query-Synergy is a simple yet effective method for retrieval across multiple languages.
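The score integration described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two similarity scores are combined, so the weighted sum below, the `weight` parameter, and the function names `cosine` and `fused_ranking` are all assumptions made for illustration, not the authors' implementation. The toy two-dimensional vectors stand in for the outputs of a multilingual embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def fused_ranking(src_query_vec, en_query_vec, doc_vecs, weight=0.5):
    """Rank documents by a weighted sum of two query-document similarities.

    ASSUMPTION: the combination rule (linear interpolation with `weight`)
    is hypothetical; the abstract only says the scores are integrated.
    weight=0.0 reduces to conventional source-query-only retrieval.
    """
    scores = []
    for doc_id, doc_vec in enumerate(doc_vecs):
        s_src = cosine(src_query_vec, doc_vec)  # source-language query score
        s_en = cosine(en_query_vec, doc_vec)    # English query score
        scores.append((doc_id, (1 - weight) * s_src + weight * s_en))
    # Highest fused score first
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (hypothetical values, not real model outputs)
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
src_q = [1.0, 0.2]
en_q = [0.2, 1.0]

ranking = fused_ranking(src_q, en_q, docs)
source_only = fused_ranking(src_q, en_q, docs, weight=0.0)
```

In this toy setup the document that is moderately similar to both queries (index 2) ranks first under fusion, whereas the source-only baseline prefers index 0. Any real use would replace the toy vectors with embeddings from a multilingual model and tune or justify the combination rule.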
Paper Type: Short
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: Multilingualism, multilingual representations, dense retrieval, passage retrieval, resources for less-resourced languages
Contribution Types: Approaches to low-resource settings, Approaches to low-compute settings (efficiency), Publicly available software and/or pre-trained models
Languages Studied: Arabic, Chinese, English, Greek, Thai, Turkish
Previous URL: https://openreview.net/forum?id=Oite6h4khG
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Limitations
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 3 and Appendix A
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: Section 3 and Appendix A
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Section 3 and Appendix A
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: No
B6 Elaboration: The datasets are publicly available, so readers can inspect them directly.
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 3 and Appendix A
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Sections 3 and 4, and Appendix A
C3 Descriptive Statistics: Yes
C3 Elaboration: Sections 3 and 4
C4 Parameters For Packages: Yes
C4 Elaboration: Section 3
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 1432