Surgical-VQLA++: Adversarial contrastive learning for calibrated robust visual question-localized answering in robotic surgery
Abstract: Highlights
• We propose a Surgical-VQLA++ framework to connect answering and localization.
• We incorporate feature calibration and adversarial contrastive training techniques.
• We expand our datasets by including additional queries related to surgical tools.
• Extensive experiments prove the effectiveness and robustness of our solution.