Ensemble Answer Selection Leveraging Cross-Lingual Dealignment for Improved Question Answering With Mixture-of-Experts Setup

Published: 2025 · Last Modified: 19 Jan 2026 · IEEE Open Journal of the Computer Society, 2025 · CC BY-SA 4.0
Abstract: Question answering (QA) research is actively progressing with both single-expert and multi-expert large language models (LLMs). Existing multi-expert QA systems use simple majority voting or the highest confidence ratio alone to determine model prediction accuracy. Using only the confidence ratio for answer selection in multi-expert settings tends to favor models that typically produce high-confidence predictions. Majority voting, on the other hand, can be misleading when the integrated experts share similar architectures and reasoning patterns, leading to convergent but potentially incorrect outputs. To address these challenges, we propose an ensemble technique that combines confidence ratio, majority voting, and a sum-of-scores tiebreaker to improve answer selection in multi-expert systems. To investigate the effectiveness of our approach, we introduce mixture-of-experts (MoE) setups consisting of encoder-decoder and encoder-only LLMs. Experiments across 22 languages from the MLQA, TyDiQA, and AfriQA benchmarks show that our approach outperforms GPT-4o and Llama-2 13B by over 10 F1 points in 6 of 9 low-resource African languages. Similar gains are obtained across high-resource languages, despite the use of foundational models in our MoE setups. Overall, our approach improves on existing state-of-the-art baselines for cross-lingual QA in several high- and low-resource languages. Our code and models are publicly available.
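The abstract names the ensemble's components (majority voting, confidence scores, and a sum-of-scores tiebreaker) without giving their exact mechanics. The sketch below is a minimal, hypothetical illustration of how such a selection rule could be composed: majority vote first, then break ties by the summed confidence of each tied answer. All function and variable names are assumptions for illustration; the paper's actual confidence-ratio criterion is not reproduced here.

```python
from collections import Counter, defaultdict

def select_answer(predictions):
    """Pick one answer from a list of (answer, confidence) pairs,
    one pair per expert.

    Hypothetical sketch: majority voting, with a sum-of-confidence
    tiebreaker when several answers receive the same number of votes.
    """
    # Count how many experts produced each answer string
    votes = Counter(answer for answer, _ in predictions)
    best_count = max(votes.values())
    tied = [answer for answer, count in votes.items() if count == best_count]
    if len(tied) == 1:
        return tied[0]

    # Tiebreaker: sum each tied answer's confidence scores across experts
    score_sum = defaultdict(float)
    for answer, confidence in predictions:
        if answer in tied:
            score_sum[answer] += confidence
    return max(tied, key=lambda a: score_sum[a])
```

For example, three experts voting ("Paris", 0.9), ("Paris", 0.4), ("London", 0.8) yield "Paris" by majority; a one-vote-each tie falls through to the confidence sums.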