Procedurally Generated Colonoscopy and Laparoscopy Data for Improved Model Training Performance

Thomas Dowrick, Long Chen, João Ramalhinho, Juana González-Bueno Puyal, Matthew J. Clarkson

Published: 2023, Last Modified: 05 Nov 2023, DEMI@MICCAI 2023, Readers: Everyone

Abstract: The use of synthetic/simulated data can greatly improve model training performance, especially in areas such as image-guided surgery (IGS), where real training data can be difficult to obtain or limited in size. Procedural generation allows large datasets to be rapidly generated and automatically labelled, while also randomising relevant parameters within the simulation to provide wide variation in the models and textures used in the scene. A method for procedural generation of both textures and geometry for IGS data is presented, using Blender Shader Graphs and Geometry Nodes. The resulting synthetic datasets are used to pre-train models for polyp detection (YOLOv7) and organ segmentation (U-Net), with performance evaluated on open-source datasets. Pre-training models with synthetic data significantly improves both model performance and generalisability (i.e. performance when evaluated on other datasets). Mean Dice score across all models for liver segmentation increased by 15% (p=0.02) after pre-training on synthetic data. For polyp detection, Precision increased by 11% (p=0.002), Recall by 9% (p=0.01), mAP@.5 by 10% (p=0.01) and mAP@[.5:.95] by 8% (p=0.003). All synthetic data, as well as examples of different Shader Graph/Geometry Node operations, can be downloaded at https://doi.org/10.5522/04/23843904 .
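
For readers unfamiliar with the segmentation metric quoted above, the following is a minimal sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), for two binary masks. The abstract does not describe the authors' evaluation code, so the function name `dice_score`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Flatten both masks and coerce every entry to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labelled foreground in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 3 foreground pixels each, 2 of them shared.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
score = dice_score(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

A score of 1.0 indicates identical masks and 0.0 indicates no overlap. In practice the metric would be computed per image on model outputs and averaged over a test set.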
