Localizing the Center of Surgical Action in Laparoscopic Videos: A Point-Supervised Heatmap Regression Approach

14 Apr 2026 (modified: 16 Apr 2026), MIDL 2026 Short Papers Submission, CC BY 4.0
Keywords: Laparoscopic camera navigation, Center of surgical action, Heatmap regression, Point localization, Automated skill assessment
TL;DR: To automate the assessment of laparoscopic camera centering, this study formalizes localization of the center of surgical action (COSA) as a heatmap regression task and shows that a DINOv3-ViT-S model predicts coordinates within 20% of the frame diagonal from ground truth in approximately 95% of cases.
Abstract: Automated assessment of laparoscopic camera centering requires reliable spatial localization of the center of surgical action (COSA). We formalize this task as a point-localization problem and address it via heatmap regression. We introduce a dataset of 500 annotated laparoscopic video clips and evaluate three encoder-decoder architectures (UNet-ResNet-34, Segformer-MiT-B1, and a custom DINOv3-ViT-S). The DINOv3-based model achieved the best performance, with approximately 95% of detections deviating from ground truth by less than 20% of the frame diagonal. This establishes a strong baseline for the COSA localization task and provides an open benchmark for future camera-navigation research.
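As an illustration of the task formulation and the reported metric, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian target heatmap centered on the annotated point, argmax decoding of the predicted heatmap, and a Euclidean error normalized by the frame diagonal. The function names, the `sigma` value, and the decoding rule are all hypothetical choices made for this example; the abstract does not specify them.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=8.0):
    """Build a target heatmap with a Gaussian peak at the annotated point (cx, cy).

    sigma is an assumed value for illustration only.
    """
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def decode_argmax(heatmap):
    """Return the (x, y) pixel with the highest heatmap value (assumed decoding rule)."""
    best_value, best_xy = -math.inf, (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


def diagonal_normalized_error(pred_xy, true_xy, height, width):
    """Euclidean distance between two points as a fraction of the frame diagonal."""
    dist = math.hypot(pred_xy[0] - true_xy[0], pred_xy[1] - true_xy[1])
    return dist / math.hypot(width, height)


def fraction_within(errors, threshold=0.20):
    """Share of predictions whose normalized error is below the threshold."""
    return sum(1 for e in errors if e < threshold) / len(errors)
```

Under these assumptions, a result such as the one in the abstract corresponds to `fraction_within(errors, 0.20)` being roughly 0.95, where `errors` holds one `diagonal_normalized_error` value per evaluated frame.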
Reproducibility: https://gitlab.gwdg.de/cds/lcn-cosa-localization
Submission Number: 58