Comparative Evaluation of Temporal Modeling in Foundation Models for Echocardiographic Left Ventricular Ejection Fraction Estimation

15 Apr 2026 (modified: 16 Apr 2026) | MIDL 2026 Short Papers Submission | CC BY 4.0
Keywords: echocardiography, cardiac ultrasound, left ventricular ejection fraction estimation, foundation models, medical foundation models, vision-language model, video foundation model, temporal modeling, frozen probing, contrastive learning, masked autoencoder, EchoNet-Dynamic, CAMUS
TL;DR: A unified frozen-probe evaluation across four vision and video foundation models shows that the benefit of temporal input for echocardiographic LVEF estimation is model-dependent, with strong gains only for a video-contrastive model.
Registration Requirement: Yes
Abstract: Automated left ventricular ejection fraction (LVEF) estimation has relied on image-level representations, even though echocardiography is an inherently temporal modality. In this study, we compare four pretrained models spanning image-based and video-based representations to assess the effect of using video information directly for LVEF estimation. The models (OpenCLIP, EchoCLIP, Echo-Vision-FM, and VideoCLIP) were evaluated with identical frozen MLP probes across 10 seeds on the EchoNet-Dynamic dataset. VideoCLIP achieved the best 16-frame result (4.79\% MAE), while mean-pooled image encoders showed no improvement over single-frame baselines. Reconstruction-based pretraining encoded temporal structure, but that structure was not linearly accessible under frozen probing ($R^{2}=0.178$). Within this frozen-probe benchmark, the observed differences in temporal benefit were more consistent with pretraining objective than with input modality alone.
Reproducibility: Codebase available upon request
Visa & Travel: No
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 81
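Illustrative sketch (not the authors' code): the abstract refers to mean-pooling per-frame embeddings from image encoders and to reporting MAE and $R^{2}$ across seeds. The minimal Python below shows one way those quantities could be computed. All function names, the toy numbers, and the choice of sample standard deviation are assumptions made for illustration; the submission does not specify its implementation.

```python
from statistics import mean, stdev


def mean_pool_frames(frame_embeddings):
    """Average a clip's per-frame embeddings (list of equal-length vectors)
    into a single clip-level vector. Hypothetical helper for illustration."""
    n_frames = len(frame_embeddings)
    return [sum(dim_values) / n_frames for dim_values in zip(*frame_embeddings)]


def mean_absolute_error(y_true, y_pred):
    """MAE in the same units as the targets (here, LVEF percentage points)."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize_over_seeds(per_seed_values):
    """Mean and sample standard deviation of a metric across seeds.
    (Assumption: the paper may aggregate seeds differently.)"""
    return mean(per_seed_values), stdev(per_seed_values)


if __name__ == "__main__":
    # Toy clip with two frames and two embedding dimensions (made-up values).
    clip = [[1.0, 2.0], [3.0, 4.0]]
    print(mean_pool_frames(clip))

    # Toy LVEF targets and predictions (made-up values, not study data).
    y_true = [50.0, 60.0, 70.0]
    y_pred = [52.0, 58.0, 70.0]
    print(mean_absolute_error(y_true, y_pred), r_squared(y_true, y_pred))
```

In a frozen-probe setting, the pooled clip vector would be the fixed input to the probe, and only the probe's weights would be trained; the encoder itself stays untouched.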