Abstract: In minimally invasive surgeries, such as endoscopic and ophthalmic procedures, specular highlights on tissue and instrument surfaces can obscure critical details, compromising surgical safety and precision. Traditional methods rely on color segmentation and filter-based optimization, but they are highly sensitive to lighting variations and restore the highlighted regions poorly. Although deep learning improves detection robustness, its effectiveness is limited by the scarcity of annotated medical data and by unnatural boundary transitions in the restored regions. To address these challenges, this paper proposes a two-stage hierarchical network framework. First, a Hierarchical Feature Attention Network (HFA-Net) is designed, integrating spatial-shift segmented attention (S\(^{2}\)MLP), dual-flow attention (DFA), multi-scale feature fusion (SFF), and partial mask convolution (PMConv) to detect and remove specular highlights precisely. Second, a large-mask inpainting model (LaMa) is introduced, which uses dilated mask expansion to enhance contextual awareness and improve texture consistency in the restored regions. To address the scarcity of medical highlight datasets, we construct four specialized datasets covering various surgical scenarios, including ophthalmic injections and instrument reflections, and also incorporate publicly available data to improve model generalization. Experimental results demonstrate that the proposed method outperforms existing approaches across six datasets in both detection accuracy and restoration quality, with the largest gains on complex textures and at region boundaries, where transitions appear more natural. Our code is available at https://github.com/tkllndxn/highlight-removal.
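The mask-expansion idea in the second stage can be illustrated with a small sketch. It assumes a binary highlight mask (1 = highlight pixel) and a square structuring element. The function name `dilate_mask` and the `radius` parameter are hypothetical and chosen only for illustration; this is not the authors' implementation, which is available in the repository linked above.

```python
def dilate_mask(mask, radius=1):
    """Binary dilation with a (2*radius + 1) x (2*radius + 1) square element.

    `mask` is a list of equal-length rows of 0/1 values (assumed layout).
    Every pixel within `radius` rows and columns of a highlight pixel is
    marked, so the region handed to an inpainting model also covers the
    soft border around each highlight.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Mark the neighbourhood of this highlight pixel, clipped to the image.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    out[ny][nx] = 1
    return out


# Toy example: a single highlight pixel in the centre of a 5x5 mask.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
dilated = dilate_mask(mask, radius=1)
```

A larger `radius` gives the inpainting model more surrounding context to blend with, at the cost of regenerating more valid pixels; how far the paper actually expands the mask is not stated in the abstract.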
External IDs: dblp:conf/miccai/LiCHGWZTLH25