Keywords: Progressive LoRA Rank Shrinking, Sequential Fine-Tuning, Training Schedules, Difficulty-Aware Training, Math Word Problems
Abstract: This work investigates a human-inspired sequential fine-tuning (SeqFT) method to improve the performance of resource-constrained large language models (LLMs) on math word problems. Instead of training on the entire dataset at once, models are exposed to progressively harder tasks level by level, while earlier data is periodically reintroduced to mitigate catastrophic forgetting. In addition, a strategy called Progressive LoRA Rank Shrinking (PLRS) is proposed, which progressively reduces the LoRA rank at each stage to prevent the overwriting of parameters learned at earlier levels. Evaluations on the MATH dataset demonstrate that this approach consistently outperforms both parameter-efficient fine-tuning and naive multi-level training, yielding a 2-7% improvement in exact-match accuracy. The study examines the effects of (1) repeated data exposure, (2) difficulty-based task ordering via SeqFT, and (3) PLRS. An analysis of problem-solving trajectories further reveals that PLRS facilitates the retention of earlier skills in a multi-stage setup. These findings suggest that, beyond conventional data augmentation, carefully designed training schedules can significantly enhance math problem-solving capabilities in LLMs.
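The abstract's training schedule can be illustrated with a minimal sketch, not the authors' released code: sequential fine-tuning over difficulty levels with a LoRA rank that shrinks at each stage. The data helper `load_level`, the training callback `train_one_stage`, the rank schedule, and the choice to merge each stage's adapter into the base weights before attaching a smaller-rank adapter are all assumptions for illustration, built on the Hugging Face `peft` API.

```python
# Minimal sketch of SeqFT + Progressive LoRA Rank Shrinking (PLRS).
# Assumptions: Hugging Face `peft`, a hypothetical load_level(level, replay_ratio)
# data loader, a generic train_one_stage SFT loop, and an illustrative rank schedule.
from peft import LoraConfig, get_peft_model

def seqft_plrs(base_model, tokenizer, load_level, train_one_stage,
               ranks=(64, 32, 16, 8, 4), replay_ratio=0.2):
    """Train level by level, shrinking the LoRA rank at each harder stage."""
    model = base_model
    for level, rank in enumerate(ranks, start=1):
        # Fresh LoRA adapter whose rank shrinks as difficulty grows, so later
        # stages have less capacity to overwrite earlier-learned parameters.
        config = LoraConfig(r=rank, lora_alpha=2 * rank,
                            target_modules=["q_proj", "v_proj"],
                            task_type="CAUSAL_LM")
        model = get_peft_model(model, config)

        # Current-level problems plus a small replay slice of earlier levels
        # to mitigate catastrophic forgetting (hypothetical data helper).
        dataset = load_level(level, replay_ratio=replay_ratio)
        train_one_stage(model, tokenizer, dataset)

        # Fold this stage's LoRA update into the base weights before the next stage.
        model = model.merge_and_unload()
    return model
```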
Submission Number: 98