Endoscopic Scoring and Localization in Unconstrained Clinical Trial Videos

Jinlin Xiang, Hillol Sarker, Bozhao Qi, Ruisu Zhang, Roger Trullo, Salvatore Badalamenti, Maria Wiekowski, Annie Kruger, Etienne Pochet, Qi Tang, Wei Zhao

Published: 01 Jan 2025, Last Modified: 15 May 2025, Venue: WACV 2025, License: CC BY-SA 4.0

Abstract: Endoscopic assessment using the Mayo Clinic score (Mayo score; 4 categories) is currently the standard for diagnosing and evaluating mucosal disease activity. However, annotating Mayo scores is time-consuming and often relies on weakly labeled evaluations from central and local readers (doctors), leaving a large number of video clips unlabeled or mislabeled. These labels also suffer from data imbalance due to patient distributions and varying disease severity levels. These limitations underscore the need for more customizable and refined methods for endoscopic scoring and localization. To address these challenges, we introduce an end-to-end pipeline with a new dataset for endoscopic scoring and localization in unconstrained clinical trial videos. Specifically, we propose an automated scoring system that includes an active learning-based preprocessing stage for cleaning the raw videos, which are then weakly labeled by doctors. The result is a comprehensive dataset for endoscopic Mayo scoring, comprising approximately 49.9 hours of video and 86,423 clips, which provides a solid foundation for developing endoscopic scoring and localization models. We then propose a video classification model for Mayo score classification. For personalized disease quantification in localization, we introduce a novel 1D trajectory model with a new cumulative disease score that addresses the limitations of previous 3D trajectory projection methods. Our dataset and end-to-end pipeline offer a valuable foundation for advancing endoscopic clinical trial research.