Partial-Correlation Learning for Large Language Models with Skip-Tuning

ICLR 2026 Conference Submission 22378 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Large Language Models, Supervised Fine-Tuning
Abstract: Large Language Models (LLMs) require post-training to adapt to specific applications, and Supervised Fine-Tuning (SFT) is crucial for injecting emerging or domain-specific knowledge. Conventional SFT on complete sequential text risks a distribution shift away from the pretraining corpora, because the fine-tuning data contains large volumes of common-style text, which can lead to overfitting and catastrophic forgetting. We introduce Skip-Tuning, a novel fine-tuning strategy that instead trains on noncontinuous text segments. Skip-Tuning performs skipped language modeling on these segments and enables a paradigm of partial-correlation learning, in which the model learns from sparse but meaningful text fragments. By excluding common-style text and fine-tuning only on knowledge-intensive text, Skip-Tuning improves fine-tuning effectiveness and generalization in the knowledge-editing setting. We further demonstrate the effectiveness of partial-correlation learning on a system-prompt following task, illustrating the broad applicability of Skip-Tuning across NLP scenarios.
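The abstract does not spell out how skipped language modeling is implemented. As a minimal illustrative sketch only, one common way to restrict a causal-LM loss to selected knowledge-intensive fragments is label masking: tokens outside the chosen spans get the ignore index so they contribute no gradient. The function `build_skip_labels`, the `keep_spans` argument, and the span-selection step itself are assumptions for illustration, not the paper's stated procedure.

```python
# Illustrative sketch (assumption, not the paper's implementation):
# supervise only selected (knowledge-intensive) spans by masking all
# other positions with the ignore index used by torch.nn.CrossEntropyLoss
# and Hugging Face causal-LM heads.
from typing import List, Tuple
import torch

IGNORE_INDEX = -100  # positions with this label contribute no loss


def build_skip_labels(input_ids: torch.Tensor,
                      keep_spans: List[Tuple[int, int]]) -> torch.Tensor:
    """Return labels that keep only the selected spans for supervision.

    input_ids : (seq_len,) token ids of the full training sequence
    keep_spans: half-open [start, end) index ranges to train on
    """
    labels = torch.full_like(input_ids, IGNORE_INDEX)
    for start, end in keep_spans:
        labels[start:end] = input_ids[start:end]
    return labels


if __name__ == "__main__":
    # Toy example: a 12-token sequence where only tokens 3..6 and 9..11
    # are treated as knowledge-intensive fragments.
    ids = torch.arange(100, 112)
    labels = build_skip_labels(ids, keep_spans=[(3, 6), (9, 11)])
    print(labels)
    # tensor([-100, -100, -100,  103,  104,  105, -100, -100, -100,  109,  110, -100])
```

Under this reading, the "partial correlation" comes from the model still attending to the full context while receiving loss signal only on the sparse fragments; how the paper actually selects or skips segments may differ.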
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 22378