Teaching LLMs According to Their Aptitude: Adaptive Switching Between CoT and TIR for Mathematical Problem Solving

ACL ARR 2025 May Submission4984 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Existing supervised fine-tuning (SFT) approaches to enhancing the mathematical reasoning of large language models (LLMs) rely either on Chain-of-Thought (CoT) for generalizability or on Tool-Integrated Reasoning (TIR) for precise computation. While efforts have been made to combine these methods, they primarily rely on post-selection or predefined strategies, leaving an open question: could we endow LLMs with the ability to adaptively choose between CoT and TIR based on the math problem at hand? In this work, we propose TATA (Teaching LLMs According to Their Aptitude), an adaptive framework that enables LLMs to spontaneously select a reasoning strategy suited to each problem, aligned with their intrinsic aptitude. TATA incorporates base-LLM-aware data selection during SFT to tailor training data to the model's unique abilities, which equips the LLM to autonomously determine and apply an effective reasoning strategy at test time. Empirical results demonstrate that TATA effectively combines the complementary strengths of CoT and TIR, achieving superior or comparable performance with improved inference efficiency compared to existing methods. Further analysis underscores the critical role of aptitude-aware data selection in enabling LLMs to make effective, adaptive reasoning decisions and to align reasoning strategies with model capabilities.
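The abstract's core mechanism, base-LLM-aware data selection, can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: for each training problem, score how reliably the base model solves it in each format (CoT vs. TIR) and keep the SFT example in the format the model handles better. The `score_fn` hook and `toy_score` stand in for actual base-model sampling and answer checking, which the paper's method would perform.

```python
def select_sft_data(problems, score_fn, margin=0.0):
    """Base-LLM-aware selection sketch: for each problem, keep the
    reasoning style ('cot' or 'tir') the base model answers more
    reliably, as measured by score_fn(problem, style) -> accuracy in [0, 1].
    `margin` lets you require TIR to beat CoT by a threshold before
    preferring the tool-integrated format."""
    selected = []
    for prob in problems:
        acc = {style: score_fn(prob, style) for style in ("cot", "tir")}
        style = "tir" if acc["tir"] > acc["cot"] + margin else "cot"
        selected.append({"problem": prob, "style": style, "scores": acc})
    return selected

# Toy scorer standing in for sampling the base model: pretend the model
# is more accurate with TIR on computation-heavy problems (hypothetical
# keyword heuristic) and with CoT otherwise.
def toy_score(problem, style):
    computation_heavy = "digits" in problem or "integral" in problem
    if style == "tir":
        return 0.9 if computation_heavy else 0.5
    return 0.4 if computation_heavy else 0.7

data = select_sft_data(
    ["prove the identity", "sum the digits of 2**100"], toy_score
)
```

In a real pipeline, `score_fn` would sample multiple CoT and TIR generations from the base model and compare them against reference answers; the toy heuristic above only demonstrates the selection logic.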
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: mathematical NLP, math QA, chain-of-thought, tool-integrated reasoning, fine-tuning
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study
Languages Studied: English, Python
Submission Number: 4984