Keywords: Self-improvement for LLM
Abstract: Large language models (LLMs) excel at generating coherent text but are constrained by their large parameter counts and high memory requirements. Recent studies suggest that dynamically adjusting inference operations can enhance performance without significantly increasing model size. We introduce the stutter mechanism, which enables self-improvement by selectively applying additional layers to challenging tokens, mimicking a human stutter to allocate more computational effort where needed. Our experiments with Pythia models show that the stutter mechanism consistently improves performance across benchmarks. Notably, the Pythia-410M-stutter model outperforms the larger Pythia-1B model on WinoGrande and WSC. Additionally, our method is data-efficient, requiring less than 1% of the pretraining data for additional training. These results demonstrate the stutter mechanism's potential to enhance LLMs' efficiency and performance in real-world applications.
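To make the idea concrete, below is a minimal sketch of a stutter-style adaptive-depth forward pass. It assumes a per-token predictive-entropy threshold as the difficulty signal and a small set of extra transformer layers re-applied only to the tokens flagged as difficult; the class, the `entropy_threshold` parameter, and the routing details are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch: selectively re-process "challenging" tokens with extra layers.
# Names (StutterSketch, entropy_threshold, stutter_layers) are hypothetical.
import torch
import torch.nn as nn


class StutterSketch(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=4, vocab_size=1000,
                 n_stutter_layers=2, entropy_threshold=2.0):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)

        def make_layer():
            return nn.TransformerEncoderLayer(
                d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)

        self.layers = nn.ModuleList([make_layer() for _ in range(n_layers)])
        # Extra layers applied only to tokens judged difficult (assumed design).
        self.stutter_layers = nn.ModuleList(
            [make_layer() for _ in range(n_stutter_layers)])
        self.lm_head = nn.Linear(d_model, vocab_size)
        self.entropy_threshold = entropy_threshold

    def forward(self, token_ids):
        h = self.embed(token_ids)              # (batch, seq, d_model)
        for layer in self.layers:
            h = layer(h)

        # Per-token predictive entropy as a stand-in difficulty signal.
        logits = self.lm_head(h)
        probs = logits.softmax(dim=-1)
        entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1)
        hard = entropy > self.entropy_threshold  # (batch, seq) bool mask

        # "Stutter": give challenging tokens another pass through extra layers.
        h_extra = h
        for layer in self.stutter_layers:
            h_extra = layer(h_extra)
        h = torch.where(hard.unsqueeze(-1), h_extra, h)
        return self.lm_head(h)


if __name__ == "__main__":
    model = StutterSketch()
    out = model(torch.randint(0, 1000, (2, 16)))
    print(out.shape)  # torch.Size([2, 16, 1000])
```

In this sketch, the extra layers run on the full sequence and the mask merges their output only for hard tokens; a production version would likely gather just the flagged positions to realize the compute savings the abstract describes.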
Submission Number: 24