Deployment and Explanation of Deep Models for Endoscopy Video Classification
Abstract: Deep neural networks achieve state-of-the-art performance on several standard datasets across a variety of domains. While many models pre-trained on such datasets are available off the shelf, practitioners aiming to deploy deep-learning-based solutions in a specific real-world application face two important challenges. First, the standard models cannot be used as-is and need significant reconfiguration and downstream training to suit the application. Second, in many critical applications involving humans, it is important to provide, along with the model, a mechanism that explains its decisions. In this paper, we address these challenges in the context of deploying deep models for endoscopy video analysis. Our contributions are (i) a decoupled CNN-Transformer for classifying intubation procedures, and (ii) a mechanism that explains the model's decisions. The CNN-Transformer performs better than the baseline off-the-shelf model with downstream training, and our explanations show that the CNN-Transformer model uses the right spatial and temporal features to arrive at the final classification.
Article: pdf