Keywords: Low-Rank Adaptation, Gated Modulation, Language Style Modeling
Abstract: An AI assistant for official accounts must emulate the distinctive language styles of millions of authors when answering users' questions. Traditional approaches train a separate model for each author's style or embed style information directly into prompts; some also rely on chain-of-thought (CoT) methods. However, these methods suffer from parameter explosion, limited generalization, and inefficiency, making them unscalable to applications involving millions of styles.
We propose a novel decomposition-fusion framework that decomposes language style into multiple orthogonal dimensions—such as semantics, syntax, grammar, and word order—and pretrains a LoRA module for each dimension. A gating network is explicitly trained to aggregate author style embeddings and dynamically compute a weighting coefficient for each dimension. By combining the fine-tuned LoRA parameters through these learned weights, the model achieves personalized style expression.
This method enables a single model to represent an enormous variety of styles, significantly improves zero-shot generalization to unseen author styles, and provides interpretable style representations. Experimental results demonstrate enhanced personalization and naturalness in question-answering generation while maintaining stylistic diversity. Our framework exhibits strong controllability, scalability, and interpretability, offering an effective solution for extreme-scale, multi-style language modeling.
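To make the fusion mechanism concrete, the following is a minimal sketch (not the authors' released code) of how per-dimension LoRA updates could be combined through a gating network conditioned on an author style embedding. The module name `StyleGatedLinear`, the hyperparameters (`num_dims`, `rank`, `style_dim`), and the softmax gate are illustrative assumptions.

```python
# Minimal sketch of gated LoRA fusion: a frozen base linear layer is
# modulated by a weighted sum of per-style-dimension low-rank updates,
# with the weights produced by a gating network over an author style
# embedding. All names and sizes here are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StyleGatedLinear(nn.Module):
    def __init__(self, in_features, out_features, num_dims=4, rank=8, style_dim=128):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)  # pretrained weight, kept frozen
        self.base.weight.requires_grad_(False)
        # One LoRA pair (A, B) per style dimension (e.g. semantics, syntax, grammar, word order)
        self.lora_A = nn.Parameter(torch.randn(num_dims, rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(num_dims, out_features, rank))
        # Gating network: author style embedding -> one weight per style dimension
        self.gate = nn.Sequential(
            nn.Linear(style_dim, 64), nn.ReLU(), nn.Linear(64, num_dims)
        )

    def forward(self, x, style_emb):
        # x: (batch, in_features), style_emb: (batch, style_dim)
        w = F.softmax(self.gate(style_emb), dim=-1)       # (batch, num_dims) gating weights
        u = torch.einsum("bi,dri->bdr", x, self.lora_A)   # project input into each LoRA rank space
        u = torch.einsum("bdr,dor->bdo", u, self.lora_B)  # expand to per-dimension output updates
        delta = torch.einsum("bd,bdo->bo", w, u)          # weighted fusion across style dimensions
        return self.base(x) + delta
```

In this sketch the gate normalizes weights with a softmax over the style dimensions; an unnormalized or per-layer weighting scheme would also fit the framework as described.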
Primary Area: generative models
Submission Number: 16669