Contrastive VQ Priors for Multi-Class Plaque Segmentation via SAM Adaptation

Published: 21 Apr 2026, Last Modified: 21 Apr 2026
Accepted by TMLR
License: CC BY 4.0
Abstract: Accurate plaque subtype segmentation in coronary CT angiography (CCTA) is clinically relevant yet remains difficult in practice, where annotations are scarce and the visual evidence for non-calcified lesions is subtle and highly variable. Meanwhile, segmentation foundation models such as SAM provide strong robustness from large-scale pretraining, but their benefits do not reliably transfer to private CCTA tasks under naïve fine-tuning, especially for a multi-class plaque taxonomy. We present a targeted strategy to transfer SAM's segmentation robustness to a private CCTA setting by injecting a task-specific, texture-aware prior into the SAM feature stream. Our framework has two stages: (i) we learn a discrete latent prior from the private CCTA data using a vector-quantized autoencoder, and structure it with supervised contrastive learning to emphasize hard class boundaries; (ii) we fuse this prior into a SAM-based encoder through a query-based feature-aware cross-attention module, and decode with a multi-class head/decoder tailored to the plaque taxonomy. On this private CCTA cohort, the proposed design improves overall performance over the compared baselines, with the largest gains on vessel wall and non-calcified plaque. Ablations suggest that the class-structured prior, query-based fusion, and multi-class decoding each contribute to the final result within this setting.
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We uploaded the de-anonymized camera-ready version of the manuscript and appendix. The revisions are limited to exposition polishing and claim calibration, following the Action Editor’s minor-revision guidance. Specifically, we consistently frame the paper as a targeted study in a private five-class CCTA setting rather than a broad claim about SAM adaptation or cross-domain robustness; we strengthened the discussion of the class-wise results and ablations, especially the larger gains on vessel wall and non-calcified plaque, the smaller gains on lumen and calcified plaque, and the contribution of the multi-class decoder, FAA, and supervised contrastive structuring; and we preserved explicit discussion of the evaluation scope and limitations, including the absence of a directly matched external benchmark. We also corrected minor wording and formatting issues in the main paper and appendix. No new experiments were added.
Supplementary Material: pdf
Assigned Action Editor: ~Chenyu_You1
Submission Number: 7140