Using open surgery simulation kinematic data for tool and gesture recognition

Int. J. Comput. Assist. Radiol. Surg., 2022 (modified: 09 Nov 2022)
Abstract:

Purpose: The use of motion sensors is emerging as a means of measuring surgical performance, typically for computing performance metrics and assessing skill. The aim of this study was to identify the surgical gestures and tools used during an open surgery suturing simulation from motion sensor data.

Methods: Twenty-five participants performed a suturing task on a variable tissue simulator while electromagnetic motion sensors measured their performance. The current study compares GRU and LSTM networks, which are known to perform well on other kinematic datasets, with MS-TCN++, which was developed for video data and is adapted in this work to motion sensor data. Finally, we extended all architectures for multi-task learning.

Results: In the gesture recognition task, MS-TCN++ achieved the highest performance, with an accuracy of 82.4 ± 6.97, F1-Macro of 78.92 ± 8.5, edit distance of 86.30 ± 8.42, and F1@10 of 89.30 ± 7.01. In the tool usage recognition task for the right hand, MS-TCN++ performed best on most metrics, with an accuracy of 94.69 ± 3.57, F1-Macro of 86.06 ± 7.06, F1@10 of 84.34 ± 10.90, and F1@25 of 80.58 ± 12.03. The multi-task GRU performed best on all metrics in the left-hand case, with an accuracy of 95.04 ± 4.18, edit distance of 85.01 ± 16.94, F1-Macro of 89.81 ± 11.65, F1@10 of 89.17 ± 13.28, and F1@25 of 88.64 ± 13.6.

Conclusion: Using motion sensor data, we automatically identified the surgical gestures and tools used during an open surgery suturing simulation. Our methods may be used to compute more detailed performance metrics and to assist in automatic workflow analysis. MS-TCN++ performed better in gesture recognition and in right-hand tool recognition, while the multi-task GRU gave better results in the left-hand case. It should be noted that our multi-task GRU network is significantly smaller and achieved competitive results on the remaining tasks as well.
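The segmental metrics reported above (edit distance and F1@k) score the predicted label sequence at the segment level rather than frame by frame: the edit score is a normalized Levenshtein distance between the sequences of segment labels, and F1@k counts a predicted segment as a true positive when its IoU with a same-label ground-truth segment exceeds the threshold k. The sketch below is a minimal pure-Python illustration of these standard definitions; the function names are ours, not from the paper, and the paper's exact implementation may differ in details.

```python
def segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs."""
    segs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segs.append((labels[start], start, i))
            start = i
    return segs

def edit_score(pred, gt):
    """Normalized Levenshtein distance between segment label sequences (0-100)."""
    p = [s[0] for s in segments(pred)]
    g = [s[0] for s in segments(gt)]
    m, n = len(p), len(g)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + (p[i - 1] != g[j - 1]))  # substitution
    return 100.0 * (1 - d[m][n] / max(m, n, 1))

def f1_at_k(pred, gt, k=0.10):
    """Segmental F1@k: a predicted segment is a TP if its IoU with an
    unmatched same-label ground-truth segment exceeds k."""
    ps, gs = segments(pred), segments(gt)
    used = [False] * len(gs)
    tp = 0
    for lbl, s, e in ps:
        best, best_j = 0.0, -1
        for j, (gl, g_start, g_end) in enumerate(gs):
            if gl != lbl or used[j]:
                continue
            inter = max(0, min(e, g_end) - max(s, g_start))
            union = max(e, g_end) - min(s, g_start)
            iou = inter / union
            if iou > best:
                best, best_j = iou, j
        if best > k:
            tp += 1
            used[best_j] = True
    fp = len(ps) - tp
    fn = len(gs) - tp
    return 100.0 * 2 * tp / max(2 * tp + fp + fn, 1)
```

For example, a prediction that shifts one gesture boundary by a frame still scores 100 on both metrics (the segment order and overlaps are preserved), whereas a prediction that misses a whole gesture segment is penalized by both the edit score and F1@k. This segment-level view is why MS-TCN++'s boundary smoothing helps on these metrics even when frame-wise accuracy is similar.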