Curriculum Learning for Language-guided, Multi-modal Detection of Various Pathologies

Laurenz Adrian Heidrich; Aditya Rastogi; Priyank Upadhya; Gianluca Brugnara; Martha Foltyn-Dumitru; Benedikt Wiestler; Philipp Vollmuth

Published: 27 Mar 2025, Last Modified: 22 May 2025

Venue: MIDL 2025 Poster

License: CC BY 4.0

Keywords: Medical Image Analysis, Deep Learning, Tumor Detection, Curriculum Learning

TL;DR: A novel language-guided tumor detection pipeline that uses curriculum learning to improve detection accuracy across pathologies and imaging modalities.

Abstract: Pathology detection in medical imaging is crucial for radiologists, yet current approaches that train a specialized model for each region of interest often lack efficiency and robustness. Furthermore, the scarcity of annotated medical data, particularly for diverse phenotypes, makes it difficult to achieve generalizability. To address these challenges, we present a novel language-guided object detection pipeline that leverages curriculum learning strategies, chosen for their ability to train models progressively on increasingly complex samples and thereby improve generalization across pathologies, phenotypes, and modalities. We developed a unified pipeline to convert segmentation datasets into bounding box annotations and applied two curriculum learning approaches, a teacher curriculum and a bounding box size curriculum, to train a Grounding DINO model. We evaluated our method on different tumor types in MRI and CT scans, where it showed significant improvements in detection accuracy: the teacher and bounding box size curricula yielded increases of 4.9% AP and 5.2% AP over the baseline, respectively. These results highlight the potential of curriculum learning to optimize medical image analysis and clinical workflows. The code is available at https://github.com/CCI-Bonn/CL4OD.
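The two data-side steps named in the abstract, converting segmentation masks into bounding boxes and ordering samples by box size, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names `mask_to_bbox`, `bbox_area`, and `size_curriculum` are hypothetical, the sketch assumes one 2D binary mask per lesion, and it assumes that larger boxes are the "easier" samples presented first, which the abstract does not state.

```python
import numpy as np


def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) enclosing all nonzero pixels.

    Assumes a 2D binary mask containing a single lesion; returns None
    when the mask is empty.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def bbox_area(box):
    """Pixel area of an inclusive (x_min, y_min, x_max, y_max) box."""
    x0, y0, x1, y1 = box
    return (x1 - x0 + 1) * (y1 - y0 + 1)


def size_curriculum(boxes, n_stages):
    """Split sample indices into cumulative training stages.

    Assumption: large boxes are treated as easy, so stage 0 holds the
    largest boxes and each later stage adds progressively smaller ones.
    """
    order = sorted(range(len(boxes)),
                   key=lambda i: bbox_area(boxes[i]),
                   reverse=True)
    stages, seen = [], []
    for chunk in np.array_split(order, n_stages):
        seen.extend(int(i) for i in chunk)
        stages.append(list(seen))
    return stages


# Toy example: one synthetic mask and three hand-written boxes.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:5, 3:7] = 1
box = mask_to_bbox(mask)

boxes = [(0, 0, 1, 1), (0, 0, 9, 9), (0, 0, 4, 4)]
stages = size_curriculum(boxes, n_stages=3)
```

Each entry of `stages` lists the sample indices available during that stage, so a training loop could build one data loader per stage. Reversing the sort order would give the opposite (small-to-large) schedule if that matches the intended notion of difficulty.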

Primary Subject Area: Application: Radiology

Secondary Subject Area: Detection and Diagnosis

Paper Type: Validation or Application

Reproducibility: https://github.com/CCI-Bonn/CL4OD

Submission Number: 116
