MIRACL: A Robust Framework for Multi-Label Learning on Noisy Multimodal Electronic Health Records

16 Sept 2025 (modified: 09 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Multimodal Learning, Multi-label Learning
TL;DR: A robust multi-label learning framework for noisy multimodal EHRs that leverages cross-visit and cross-modal perspectives for noise detection and correction.
Abstract: Multimodal Electronic Health Records (EHRs), comprising structured time-series data and unstructured clinical notes, offer complementary views of patient health. However, multi-label prediction tasks on multimodal EHR data, such as phenotyping, are hindered by label noise, including false positive and false negative labels. Existing noisy-label learning methods, often designed for single-label vision data, fail to capture true label dependencies or account for the cross-modal, longitudinal nature of EHRs. To address this, we propose \textbf{M}ultimodal \textbf{I}nstance \textbf{R}elabelling \textbf{A}nd \textbf{C}orrection for multi-\textbf{L}abel noise (MIRACL\footnote{\url{https://github.com/anon-coder-def/MIRACL}}), a novel framework that systematically addresses these challenges. Notably, MIRACL is the first framework designed to explicitly leverage longitudinal patient context to resolve the more challenging multi-label noise scenarios. To achieve this, MIRACL unifies three synergistic mechanisms: (1) a difficulty- and rank-based metric for robust identification of noisy instance-label pairs, (2) a class-aware correction module for robust label refinement that promotes the recovery of true label dependencies, and (3) a patient-level contrastive regularization loss that leverages both cross-modal and longitudinal patient context to correct for noisy supervision across visits. Extensive experiments on large-scale multimodal EHR datasets (MIMIC-III/IV) demonstrate that MIRACL achieves state-of-the-art robustness, improving test mAP by over 2\% under various noise levels.
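To make mechanism (1) concrete, the sketch below illustrates one common way a rank-based criterion for noisy instance-label pairs can be realized in the multi-label setting: the small-loss trick applied per label column, where pairs whose per-class loss rank falls in the top fraction (set by an assumed noise rate) are flagged as suspect. This is an illustrative assumption, not the paper's actual metric; the function name `flag_noisy_pairs` and the `noise_rate` parameter are hypothetical.

```python
import numpy as np

def flag_noisy_pairs(losses, noise_rate=0.2):
    """Flag likely-noisy (instance, label) pairs per class.

    losses: (N, C) array of per-pair losses (e.g. binary cross-entropy).
    A high loss within a label column suggests that pair's label
    disagrees with the model's current belief (small-loss criterion,
    applied per class so each label keeps its own threshold).
    Returns a boolean (N, C) mask; True = suspected noisy pair.
    NOTE: illustrative sketch, not MIRACL's actual difficulty/rank metric.
    """
    n = losses.shape[0]
    # Double argsort gives each entry's rank within its column (0 = smallest).
    ranks = np.argsort(np.argsort(losses, axis=0), axis=0)
    # Flag the top `noise_rate` fraction of losses in each class.
    cutoff = int(np.ceil((1.0 - noise_rate) * n))
    return ranks >= cutoff

# Toy example: one anomalously large loss per label column.
losses = np.array([
    [0.10, 0.20],
    [0.15, 0.90],   # column 1 outlier
    [0.80, 0.10],   # column 0 outlier
    [0.05, 0.12],
    [0.07, 0.11],
])
mask = flag_noisy_pairs(losses, noise_rate=0.2)
```

With a 20% assumed noise rate over five instances, exactly the single largest loss in each column is flagged, so the mask marks the (2, 0) and (1, 1) pairs.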
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 7114