Knowledge-enhanced Parameter-efficient Transfer Learning with METER for medical vision-language tasks
Abstract:

Objective: The full fine-tuning paradigm becomes impractical when applying pre-trained models to downstream tasks because of its significant computational and storage costs. Parameter-efficient fine-tuning (PEFT) methods can alleviate this issue; however, applying PEFT methods alone yields sub-optimal performance owing to the domain gap between pre-trained models and medical downstream tasks.

Methods: This study proposes Knowledge-enhanced Parameter-efficient Transfer Learning with METER (KPL-METER) for medical vision-language (VL) downstream tasks. KPL-METER combines PEFT methods, including a novel PEFT module for the multi-modal branches, with newly introduced external domain-specific knowledge to enhance model performance. First, a lightweight, plug-and-play module named Sharing Adapter (SAdapter) is developed and inserted into the multi-modal encoders, allowing the two modalities to maintain uni-modal features while encouraging cross-modal consistency. Second, a novel knowledge extraction method and a parameter-free knowledge modeling strategy are developed to incorporate domain-specific knowledge from the Unified Medical Language System (UMLS) into the multi-modal features. To further enhance the modeling of uni-modal features, Adapters are added to the image and text encoders.

Results: The effectiveness of the proposed model is evaluated on two medical VL tasks using three VL datasets. The results indicate that KPL-METER outperforms other PEFT methods while tuning fewer parameters. Furthermore, KPL-METER-MED, a variant that incorporates medical-tailored encoders, is developed. Compared to previous models in the medical domain, KPL-METER-MED tunes fewer parameters while generally achieving higher performance.

Conclusion: The proposed KPL-METER architecture effectively adapts general VL models to medical VL tasks, and the designed knowledge extraction and fusion methods notably enhance performance by integrating medical domain-specific knowledge. Code is available at https://github.com/Adam-lxd/KPL-METER.
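For readers unfamiliar with the adapter-style PEFT modules the abstract refers to, the sketch below illustrates the standard residual bottleneck adapter pattern (down-projection, nonlinearity, up-projection, residual). This is a generic illustration of the technique, not the paper's SAdapter: all dimensions, the zero initialization, and the function names are assumptions for demonstration.

```python
# Minimal sketch of a bottleneck adapter, the generic PEFT building block;
# hyperparameters and initialization here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def make_adapter(d_model: int, d_bottleneck: int):
    """Create down/up projection weights for a residual bottleneck adapter."""
    W_down = rng.normal(0.0, 0.02, (d_model, d_bottleneck))
    # Zero-initializing the up projection makes the adapter start as identity,
    # so inserting it does not perturb the frozen pre-trained model.
    W_up = np.zeros((d_bottleneck, d_model))
    return W_down, W_up

def adapter_forward(x, W_down, W_up):
    """x -> x + up(ReLU(down(x))); only the small projections are trainable."""
    h = np.maximum(x @ W_down, 0.0)  # down-project + ReLU
    return x + h @ W_up              # up-project + residual connection

x = rng.normal(size=(4, 768))        # a batch of token features
W_down, W_up = make_adapter(768, 64)
y = adapter_forward(x, W_down, W_up)
print(y.shape)            # (4, 768)
print(np.allclose(x, y))  # True at initialization (zero-init up projection)
```

With a 768-dimensional model and a 64-dimensional bottleneck, the adapter adds roughly 2 × 768 × 64 ≈ 98K parameters per insertion point, a small fraction of a full transformer layer, which is why such modules can be tuned while the backbone stays frozen.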