Universal and extensible language-vision models for organ segmentation and tumor detection from abdominal computed tomography

Jie Liu, Yixiao Zhang, Kang Wang, Mehmet Can Yavuz, Xiaoxi Chen, Yixuan Yuan, Haoliang Li, Yang Yang, Alan L. Yuille, Yucheng Tang, Zongwei Zhou

Published: 2024, Last Modified: 13 Nov 2025, Medical Image Anal. 2024, CC BY-SA 4.0

Abstract: Highlights
• A universal framework adapting a single model to multiple datasets and new classes.
• A language-driven parameter generator leverages embeddings from CLIP.
• We develop a class-specific, lightweight head to ease the addition of new classes.
• Universal Model is efficient, generalizable, transferable, and extensible.
• Universal Model ranks first in MSD and BTCV competitions for medical segmentation.
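
The idea of generating class-specific head parameters from language embeddings can be illustrated with a toy sketch. This is not the authors' implementation: the names `make_generator`, `generate_head`, `predict`, and `fake_embedding` are hypothetical, the dimensions are arbitrary, a seeded random vector stands in for a real CLIP text embedding, the generator is reduced to a single linear map, and scoring each class with an independent sigmoid is an assumption of this sketch. It only shows why such a design could make adding a class cheap: the shared generator is untouched, and a new class needs only a new embedding.

```python
import math
import random

EMBED_DIM = 8  # stand-in for the text-embedding width (hypothetical)
FEAT_DIM = 4   # stand-in for the image-feature width (hypothetical)


def make_generator(seed=0):
    """Shared linear map from a class embedding to head parameters.

    Each output row yields one head parameter: FEAT_DIM weights plus a bias.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)]
            for _ in range(FEAT_DIM + 1)]


def generate_head(generator, embedding):
    """Turn one class embedding into (weights, bias) for that class."""
    params = [sum(g * e for g, e in zip(row, embedding)) for row in generator]
    return params[:FEAT_DIM], params[FEAT_DIM]


def predict(heads, feature):
    """Score one feature vector with every class head independently.

    Assumption of this sketch: one sigmoid per class, no softmax.
    """
    scores = {}
    for name, (weights, bias) in heads.items():
        logit = sum(w * f for w, f in zip(weights, feature)) + bias
        scores[name] = 1.0 / (1.0 + math.exp(-logit))
    return scores


def fake_embedding(name):
    """Placeholder for a real text encoder; deterministic per class name."""
    rng = random.Random(name)
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


generator = make_generator()
heads = {name: generate_head(generator, fake_embedding(name))
         for name in ("liver", "kidney")}
feature = [0.2, -0.1, 0.4, 0.3]
before = predict(heads, feature)

# Adding a class: one new embedding, one new head, nothing else changes.
heads["pancreas tumor"] = generate_head(generator, fake_embedding("pancreas tumor"))
after = predict(heads, feature)
```

In this toy setup the scores for the existing classes are identical before and after the new class is added, because each head depends only on its own embedding and the shared generator.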