Abstract: Fine-tuning multi-turn dialogue systems requires high-quality data, yet performance degrades when training data contains low-quality or out-of-distribution (OOD) samples: errors introduced in early turns accumulate, amplifying inconsistencies and degrading response quality. However, existing methods separate data quality control from fine-tuning, overlooking turn-level dependencies and cumulative noise, which hinders end-to-end optimization in multi-turn settings. To bridge this gap, we propose TWiNS (Turn-weighted Welford-based implicit Noise Suppression), an end-to-end adaptive fine-tuning method that implicitly pinpoints noisy samples and suppresses their gradient contributions on the fly during fine-tuning, mitigating error accumulation and preserving coherence in multi-turn dialogues. Specifically, turn-aware weighting maintains contextual coherence, while Welford’s online algorithm adjusts sample weights without pre-filtering. Experiments show that TWiNS ensures stable optimization across multi-turn dialogues, improving performance on both individual and mixed-quality datasets while mitigating degradation. By suppressing noise without explicit filtering, TWiNS adapts to evolving data distributions with zero pre-filtering overhead, establishing a new paradigm for end-to-end data-quality optimization in multi-turn dialogue systems.
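The abstract does not specify the exact weighting formulation, so the following is a minimal Python sketch of how Welford-based implicit noise suppression could look: a running mean and variance of per-sample losses is maintained online with Welford's algorithm, outlier losses are down-weighted (suppressing their gradient contributions), and a turn-dependent factor stands in for the paper's turn-aware weighting. The class name `WelfordNoiseSuppressor`, the z-score-to-weight mapping, and the `turn_decay` factor are all illustrative assumptions, not TWiNS's published method.

```python
import math


class WelfordNoiseSuppressor:
    """Hedged sketch of Welford-style online noise suppression.

    Maintains running loss statistics via Welford's single-pass algorithm
    and maps each sample's loss to a weight in (0, 1]. The thresholding
    scheme and turn decay below are illustrative assumptions only.
    """

    def __init__(self, z_threshold: float = 2.0, turn_decay: float = 0.9):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations
        self.z_threshold = z_threshold  # assumed outlier cutoff
        self.turn_decay = turn_decay    # assumed turn-aware decay factor

    def update(self, loss: float) -> None:
        # Welford's online update of the running mean and variance.
        self.n += 1
        delta = loss - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (loss - self.mean)

    def weight(self, loss: float, turn_index: int) -> float:
        # Down-weight losses far from the running mean (likely noise),
        # and weight later turns less to curb error accumulation.
        if self.n < 2:
            return 1.0
        std = math.sqrt(self.m2 / (self.n - 1))
        z = abs(loss - self.mean) / (std + 1e-8)
        noise_w = 1.0 if z <= self.z_threshold else self.z_threshold / z
        turn_w = self.turn_decay ** turn_index
        return noise_w * turn_w


# Per-batch usage: scale each turn's loss before backpropagation.
suppressor = WelfordNoiseSuppressor()
per_turn_losses = [0.8, 1.1, 4.7, 0.9]  # toy per-turn loss values
for turn_idx, loss in enumerate(per_turn_losses):
    weighted_loss = suppressor.weight(loss, turn_idx) * loss
    suppressor.update(loss)
```

In this reading, the weight multiplies each turn's loss before the backward pass, so the gradients of likely-noisy samples are shrunk on the fly rather than filtered out beforehand, consistent with the abstract's claim of zero pre-filtering overhead.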
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: multi-turn dialogue systems, fine-tuning, noise suppression, grounded dialog
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 3363