Recently, substantial advances have been made in training language models to carry out step-by-step reasoning for solving complex numerical reasoning tasks. Beyond the methods used to solve these problems, the structure and formulation of the problems themselves also play a crucial role in determining the performance of large language models. We observe that even small changes in the surface form of a mathematical problem can have a profound impact on both the answer distribution and the solve rate. This highlights the vulnerability of LLMs to surface-level variations, revealing their limited robustness when reasoning through complex problems. In this paper, we propose RM-POT, a method that first reformulates the surface form of mathematical problems and then applies the Program of Thoughts (PoT) approach to solve them. PoT disentangles computation from reasoning, thereby enhancing the reasoning capabilities of LLMs. Experiments on various datasets demonstrate the effectiveness of our proposed method.
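To make the PoT step concrete, the following is a minimal sketch of the general Program-of-Thoughts pattern under stated assumptions: the model is prompted to emit a short Python program that stores its result in a variable, and the program is then executed so that the arithmetic is handled by the interpreter rather than the model. The helper `query_llm` is a hypothetical placeholder for whatever model API is used, and the prompt wording and variable name `answer` are illustrative, not the paper's exact implementation.

```python
def query_llm(prompt: str) -> str:
    """Placeholder: call a language model and return its text completion."""
    raise NotImplementedError  # hypothetical hook; substitute a real model call


def solve_with_pot(problem: str) -> float:
    """Sketch of PoT: the model writes a program; Python does the computation."""
    prompt = (
        "Write Python code that computes the answer to the problem below "
        "and stores it in a variable named `answer`.\n\n"
        f"Problem: {problem}\n\n# Python code:\n"
    )
    program = query_llm(prompt)      # reasoning step: the model generates code
    namespace: dict = {}
    exec(program, namespace)         # computation step: the interpreter runs it
    return namespace["answer"]


# Example usage (in RM-POT, the problem's surface form would be reformulated first):
# answer = solve_with_pot("A train travels 60 km in 1.5 hours. What is its speed in km/h?")
```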