SAT-RRG: Self-Adaptive Training for Radiology Report Generation Leveraging LLMs for Dynamic Token-Level Refinement
Abstract: Existing radiology report generation (RRG) methods rely on word-level alignment with reference reports, making them overly sensitive to surface phrasing and blind to semantically valid variations. Lacking semantic feedback during training, these methods treat all tokens uniformly and fail to prioritize critical corrections. As a result, models cannot dynamically assess or refine report quality, leading to clinically suboptimal outputs. We propose SAT-RRG, a self-adaptive training framework that identifies phrase-level semantic errors and provides token-level supervision, both correcting mistakes and reinforcing accurate predictions. We introduce two custom loss functions: CTAL, which consolidates confidently correct tokens, and ETAPL, which penalizes overconfident semantic errors; both adapt to the evolving confidence landscape during training. The framework builds on a unified LLM backbone for both generation and error detection, so inference incurs no additional computational overhead. SAT-RRG achieves state-of-the-art performance on MIMIC-CXR and IU-Xray. The code will be released upon publication.
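As a reading aid, the sketch below shows one way the two token-level losses named in the abstract could be instantiated. It is a minimal sketch under stated assumptions, not the paper's implementation: the tensor shapes, the fixed confidence threshold tau, the externally supplied error_mask, and the unlikelihood-style penalty for ETAPL are all illustrative choices.

```python
import torch

def sat_rrg_losses(logits, targets, error_mask, tau=0.7):
    """Hedged sketch of CTAL- and ETAPL-style losses (not the paper's code).

    logits:     (B, T, V) token logits from the shared LLM backbone
    targets:    (B, T)    reference token ids
    error_mask: (B, T)    1 where the LLM-based detector flags a semantic
                          error in the generated phrase (assumed input)
    tau:        hypothetical confidence threshold; the paper's adaptive
                schedule is unknown, so a fixed value stands in here
    """
    probs = logits.softmax(dim=-1)

    # Confidence currently assigned to each reference token.
    p_ref = probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # (B, T)

    # CTAL-style term: reinforce tokens that are already predicted
    # correctly with high confidence ("consolidate" them).
    correct = ((p_ref > tau) & (error_mask == 0)).float()
    ctal = -(torch.log(p_ref + 1e-8) * correct).sum() / correct.sum().clamp(min=1.0)

    # ETAPL-style term: an unlikelihood penalty on tokens the detector
    # flags where the model is overconfident, pushing probability mass
    # away from the erroneous prediction.
    pred_ids = probs.argmax(dim=-1)                               # (B, T)
    p_pred = probs.gather(-1, pred_ids.unsqueeze(-1)).squeeze(-1)
    err = (error_mask.bool() & (p_pred > tau)).float()
    etapl = -(torch.log(1.0 - p_pred + 1e-8) * err).sum() / err.sum().clamp(min=1.0)

    return ctal, etapl
```

Because both masks are recomputed from the model's current probabilities at every step, the supervision in this sketch automatically tracks the evolving confidence landscape described in the abstract.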
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: clinical NLP, multimodality
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2589