Keywords: Model Merging, Knowledge Fusion, Efficient Specialization of LLMs, Modular LLM Adaptation.
TL;DR: SkillWeaving decomposes LLM capabilities into lightweight skillpacks, enabling modular, efficient, and self-improving LLMs with superior speed and performance.
Abstract: We propose SkillWeaving, a modular self-improvement framework that enables large language models (LLMs) to specialize using only their own generations. The key idea is to decompose general-purpose LLMs into a collection of skillpacks—lightweight, domain-specific delta modules—that reorganize and refine the model’s internal knowledge. By combining rule-based verification with preference optimization, each skillpack learns a high-quality, self-refined capability for a specific domain, achieving robust improvement without external labels. To further ensure scalable deployment, we introduce SkillZip, a fully quantized delta-compression method that jointly quantizes weights and activations, eliminating runtime decompression and enabling fast, low-cost inference. By combining shared-knowledge merging with hardware-aware design, our method achieves both parameter-efficient specialization and inference-efficient execution. On multi-task and agentic benchmarks, SkillWeaving outperforms expert-tuned baselines and even surpasses larger 32B monolithic models using only 9B parameters, while offering up to 4× speedup. Overall, our approach offers an interpretable and resource-efficient path toward LLM self-improvement.
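The abstract does not include code, but the core idea of a skillpack—a lightweight, quantized delta over the base weights—can be illustrated with a minimal sketch. This assumes symmetric per-tensor int8 quantization; the class and method names (`Skillpack`, `apply`) are illustrative placeholders, not the paper's actual implementation.

```python
import numpy as np


class Skillpack:
    """Illustrative skillpack: an int8-quantized delta over base weights.

    A hypothetical sketch of the delta-compression idea; the paper's
    SkillZip additionally quantizes activations, which is omitted here.
    """

    def __init__(self, delta: np.ndarray):
        # Symmetric per-tensor quantization: delta ≈ q * scale
        self.scale = max(float(np.abs(delta).max()) / 127.0, 1e-12)
        self.q = np.clip(np.round(delta / self.scale), -127, 127).astype(np.int8)

    def apply(self, base: np.ndarray) -> np.ndarray:
        # Dequantize the stored delta and merge it into the base weights
        return base + self.q.astype(np.float32) * self.scale


# Toy usage: recover domain-tuned weights from base weights + skillpack
rng = np.random.default_rng(0)
base = rng.standard_normal((4, 4)).astype(np.float32)
tuned = base + 0.05 * rng.standard_normal((4, 4)).astype(np.float32)
pack = Skillpack(tuned - base)
restored = pack.apply(base)
print(np.abs(restored - tuned).max())  # small quantization error
```

The skillpack stores only the int8 delta plus one scale factor, so a domain specialization costs roughly a quarter of the fp32 delta's memory and can be merged into the base model without runtime decompression.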
Primary Area: foundation or frontier models, including LLMs
Submission Number: 8938