Surgical-VQLA++: Adversarial contrastive learning for calibrated robust visual question-localized answering in robotic surgery

Published: 01 Jan 2025, Last Modified: 15 Jan 2025; Inf. Fusion 2025; CC BY-SA 4.0
Abstract: Highlights
• We propose a Surgical-VQLA++ framework to connect answering and localization.
• We incorporate feature calibration and adversarial contrastive training techniques.
• We expand our datasets by including additional queries related to surgical tools.
• Extensive experiments prove the effectiveness and robustness of our solution.