Abstract: Summarizing medical forums requires clinical precision. However, traditional supervised fine-tuning (SFT) methods rely on static datasets that cannot adapt to evolving clinical language and diverse user queries. This study aims to overcome these limitations by combining SFT with synthetic data generation and direct preference optimization (DPO). Using the PerAnsSumm dataset, we trained a Mistral 7B model to generate synthetic summaries, which we labeled as “rejected”, while expert-written summaries were marked as “chosen”. KeyBERT-extracted keywords augmented the inputs to enhance contextual relevance. Our DPO-adapted model significantly outperformed the baselines, achieving scores of 0.458 (ROUGE-Lsum), 0.511 (SacreBLEU), and 0.880 (METEOR). Keyword integration prevented performance degradation when adapting to new summary types, increasing the METEOR score by 13.4% in categories that were originally excluded. This study confirms that using synthetic data and preference optimization reduces the need for costly annotations and enables flexible, clinically precise summarization.
Paper Type: Short
Research Area: Summarization
Research Area Keywords: Summarization, Synthetic Data, Preference Optimization, Clinical NLP
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English
Submission Number: 2713
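
For concreteness, below is a minimal sketch of the pipeline described in the abstract: KeyBERT keywords are prepended to the prompt, the SFT-tuned Mistral 7B model generates the synthetic “rejected” summary, the expert-written summary serves as “chosen”, and the resulting pairs are passed to a DPO trainer. This is an illustrative sketch, not the authors' implementation: the field names (question, answer_span, expert_summary), the checkpoint path "mistral-7b-sft", the file "peranssumm_train.json", and the use of TRL's DPOTrainer are all assumptions.

    # Illustrative sketch only -- not the authors' code. Field names ("question",
    # "answer_span", "expert_summary"), the checkpoint path "mistral-7b-sft", and
    # the data file "peranssumm_train.json" are hypothetical placeholders.
    from datasets import Dataset
    from keybert import KeyBERT
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    kw_model = KeyBERT()
    tokenizer = AutoTokenizer.from_pretrained("mistral-7b-sft")      # assumed local SFT checkpoint
    model = AutoModelForCausalLM.from_pretrained("mistral-7b-sft")

    def build_preference_pair(example):
        # Prepend KeyBERT keywords to the prompt to add contextual signal.
        keywords = [kw for kw, _ in kw_model.extract_keywords(example["answer_span"], top_n=5)]
        prompt = (
            f"Keywords: {', '.join(keywords)}\n"
            f"Question: {example['question']}\n"
            f"Answer: {example['answer_span']}\n"
            "Summary:"
        )
        # A synthetic summary from the SFT model becomes the "rejected" response.
        enc = tokenizer(prompt, return_tensors="pt")
        out = model.generate(**enc, max_new_tokens=128, do_sample=True)
        rejected = tokenizer.decode(out[0][enc["input_ids"].shape[1]:], skip_special_tokens=True)
        # The expert-written summary is kept as the "chosen" response.
        return {"prompt": prompt, "chosen": example["expert_summary"], "rejected": rejected}

    raw = Dataset.from_json("peranssumm_train.json")                 # hypothetical path
    pairs = raw.map(build_preference_pair, remove_columns=raw.column_names)

    # DPO training on (prompt, chosen, rejected) triples via TRL's DPOTrainer
    # (argument names vary slightly across TRL versions).
    trainer = DPOTrainer(
        model=model,
        args=DPOConfig(output_dir="mistral-7b-dpo", beta=0.1),
        train_dataset=pairs,
        processing_class=tokenizer,
    )
    trainer.train()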