Keywords: Vision–language models, Multi-instance learning
TL;DR: A multi-instance alignment model that links small image regions to sentences from diagnostic reports.
Abstract: In diagnostic reports, experts encode complex imaging data into clinically actionable information. They describe subtle pathological findings that are meaningful in their anatomical context. Reports follow relatively consistent structures, expressing diagnostic information in few words associated with often tiny but consequential image observations. Standard vision–language models struggle to identify the associations between these informative text components and small image regions. Here, we propose "MApLe", a multi-task, multi-instance vision–language alignment approach that overcomes these limitations. It disentangles the concepts of anatomical region and diagnostic finding, and links local image information to report sentences in a patch-wise manner.
Our method consists of a text embedding model trained to capture anatomical and diagnostic concepts in sentences, a patch-wise image encoder conditioned on anatomical structures, and a multi-instance alignment of these representations. We demonstrate that MApLe successfully aligns different image regions with multiple diagnostic findings in free-text reports. We show that our model improves alignment performance over state-of-the-art baselines when evaluated on several downstream tasks.
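To make the patch-wise multi-instance alignment concrete, the following is a minimal sketch in a PyTorch-style setting. The function name multi_instance_alignment_loss, the tensor shapes, the max-over-patches pooling, and the symmetric InfoNCE objective are illustrative assumptions and not the authors' implementation: each report sentence is scored against every image patch, pooled to its best-matching patch, and contrasted against non-paired images in the batch.

```python
# Hypothetical sketch of patch-to-sentence multi-instance alignment
# (illustrative only, not the paper's actual method): each sentence is
# explained by its best-scoring image patch, and a symmetric contrastive
# loss is applied over paired images and reports within a batch.
import torch
import torch.nn.functional as F


def multi_instance_alignment_loss(patch_emb, sent_emb, temperature=0.07):
    """patch_emb: (B, P, D) image patch embeddings
       sent_emb:  (B, S, D) sentence embeddings from the paired reports."""
    patch_emb = F.normalize(patch_emb, dim=-1)
    sent_emb = F.normalize(sent_emb, dim=-1)

    # Cosine similarity of every sentence of report t to every patch of
    # image i: shape (B_t, B_i, S, P).
    sim = torch.einsum('tsd,ipd->tisp', sent_emb, patch_emb)

    # Multi-instance pooling: a sentence is matched to its best patch,
    # then sentence scores are averaged per report-image pair.
    report_to_image = sim.max(dim=-1).values.mean(dim=-1)  # (B_t, B_i)

    logits = report_to_image / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    # Symmetric InfoNCE over report-to-image and image-to-report directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

The max-over-patches pooling encodes the multi-instance assumption that each sentence is supported by at least one (possibly tiny) image region, without requiring region-level supervision.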
Primary Subject Area: Integration of Imaging and Clinical Data
Secondary Subject Area: Unsupervised Learning and Representation Learning
Registration Requirement: Yes
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 400