Anatomy-aware Representation Learning for Medical Ultrasound

Published: 26 Jan 2026, Last Modified: 26 Feb 2026, ICLR 2026 Poster, CC BY 4.0
Keywords: Foundation model, medical ultrasound, representation learning
TL;DR: An anatomy-aware representation learning framework for medical ultrasound, introducing a large-scale medical ultrasound dataset
Abstract: The diagnostic accuracy of ultrasound imaging is limited by qualitative variability and its reliance on the expertise of medical professionals. These challenges increase the demand for computer-aided diagnostic systems that enhance diagnostic accuracy and efficiency. However, the unique texture and structural attributes of ultrasound images, together with the scarcity of large-scale ultrasound datasets, hinder the effective application of conventional machine learning methodologies. To address these challenges, we propose Anatomy-aware Representation Learning (ARL), a novel self-supervised representation learning framework specifically designed for medical ultrasound imaging. ARL incorporates an anatomy-adaptive Vision Transformer (A-ViT), which is trained on the proposed large-scale medical ultrasound dataset to provide anatomy-aware feature representations. Through extensive experiments across various ultrasound-based diagnostic tasks, including breast and thyroid cancer classification, cardiac view classification, and gallbladder tumor and COVID-19 identification, we demonstrate that ARL significantly outperforms existing self-supervised learning baselines. These experiments demonstrate the potential of ARL to advance medical ultrasound diagnostics by providing anatomy-specific feature representations.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 5357