Keywords: Knowledge Annealing, Domain Adaptation, Knowledge Graph, Synthetic Data, Large Language Models
Abstract: Achieving deep expertise in vertical domains requires more than exposing Large Language Models (LLMs) to vast corpora; it demands a structured internalization of complex logic. Standard fine-tuning often treats data as a disordered bag of tokens, failing to capture the intricate dependencies essential for high-level reasoning. We propose Domain Knowledge Annealing (DKA), a curriculum learning paradigm designed to maximize domain proficiency. Inspired by physical crystallization, DKA organizes training from Local to Global: it begins by injecting discrete concepts via entity-centric samples ("heating") and progressively advances to synthesizing complex, cross-document relationships ("cooling"). This structured progression allows the model to solidify isolated facts into a coherent knowledge system. Experiments on Linguistics and Law benchmarks demonstrate that DKA significantly surpasses standard strategies, establishing a new state-of-the-art in domain-specific reasoning and understanding.
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: graph-based methods, knowledge-augmented methods, transfer learning / domain adaptation, generalization
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 4542