Abstract: This work proposes a state-of-the-art transformer architecture to detect and segment catheters in axial interventional Ultrasound image sequences. The network architecture was inspired by the Attention in Attention mechanism, temporal tracking networks, and introduced a novel 3D segmentation head that performs 3D deconvolution across time. To train the network, we introduce a new data synthesis pipeline that uses physics-based catheter insertion simulations, along with a convolutional ray-casting ultrasound simulator to produce synthetic ultrasound images of endovascular interventions. The proposed method is validated on a hold-out validation dataset, thus demonstrated robustness to ultrasound noise and a wide range of scanning angles. It was also tested on data collected from silicon aorta phantoms, thus demonstrated its potential for translation from sim-to-real. This work represents a significant step towards safer and more efficient endovascular surgery using interventional ultrasound.
External IDs:dblp:conf/icra/RanneVNB24
Loading