Abstract: Radiology report generation, which predicts text descriptions for radiological images, may face critical challenges due to data imbalance: medical tokens appear less frequently than regular tokens, and normal entries significantly outnumber abnormal ones. However, very few studies consider these imbalance issues, and fewer still consider conjugate imbalance factors. In this study, we jointly consider two imbalance factors, label and token, which determine the distributions of radiology images and language, the two fundamental modalities of the text generation task. We propose a $\textbf{J}$oint $\textbf{I}$mbalance $\textbf{A}$daptation ($\textit{JIMA}$) model to promote task robustness by leveraging token and label imbalance. Experiments on two standard evaluation datasets (IU X-ray (Demner-Fushman et al., 2015) and MIMIC-CXR (Johnson et al., 2019)) with automatic and human evaluations demonstrate significant improvements over current state-of-the-art models. We conduct extensive ablation and case analyses to examine and present the effects of dual imbalance on the robustness of radiology report generation. While data imbalance remains challenging, our approach opens new directions for the generation task.
Paper Type: long
Research Area: Generation
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English