Large Language Models (LLMs) have shown impressive performance on reasoning tasks, especially when the reasoning is conducted in English. However, leveraging multilingual capabilities can significantly enhance reasoning effectiveness. In this paper, we comprehensively explore the benefits of multilingualism in reasoning using BenchMAX and highlight a range of intriguing phenomena. Our findings indicate that employing multiple languages provides additional gains, with a notably high upper bound on these benefits. This upper bound is remarkably tolerant of variations in translation quality and language choice, yet it remains sensitive to the method used for answer selection. Unfortunately, common answer selection strategies often fail to unlock the full potential of multilingualism. Further analysis of these benefits and challenges shows that key languages such as Korean and French can enhance the reasoning abilities of various models, while common answer selection methods struggle because their performance depends on the specific combination of languages and does not improve as more languages are added. These insights may pave the way for future research aimed at fully harnessing the potential of multilingual reasoning in LLMs.\footnote{The code will be made publicly available.}