Keywords: Large language models, Mathematical reasoning, Data synthesis
Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of problem-solving tasks. Despite this success, LLMs still struggle with complex reasoning, particularly on advanced mathematical problems. Such problems require not only a deep understanding of the task description but also sophisticated logical and mathematical reasoning to determine the correct solution path, and this kind of reasoning is often lacking in existing synthetic data. To address this gap, we introduce WISDOM, which draws inspiration from the human learning process and employs curriculum learning to synthesize high-quality chain-of-thought (CoT) data progressively, from easy to hard. Our goal is to guide LLM training and improve reasoning capabilities by exposing models to increasingly challenging problems. Using the synthesized data, we fine-tune and develop the WISDOM series of models, which achieve significant improvements across multiple mathematical reasoning benchmarks. Notably, WISDOM-7B (DSMath) achieves 62.4% on MATH and matches GPT-4’s performance on AIME2024 with 2/30 correct answers. Furthermore, WISDOM-70B (Llama3) outperforms GPT-4 on AIME2024 with 3/30 correct answers, demonstrating its potential as a stronger mathematical reasoner. More data and models will be available at https://anonymous.4open.science/r/Wisdom-math-377B
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9495