Abstract: Deep learning models achieve high accuracy in the semantic segmentation of 3D point clouds, but it is challenging to discern which patterns a model has learned and how it derives its output from the input. Saliency techniques can help explain semantic segmentation models for 3D point clouds by generating saliency maps that visualize the contribution of individual input points to a particular model output. However, quantitative evaluation of the reliability of the generated saliency maps, and of how the hyperparameters of the saliency techniques influence their results, is still lacking. In this paper, we quantitatively evaluate the reliability of saliency maps generated by Integrated Gradients and pGS-CAM. To this end, we adapt established sanity checks from the image domain to 3D point cloud semantic segmentation. Our results suggest that both saliency techniques are only partially sensitive to the model parameters and the training labels. Furthermore, Integrated Gradients is unstable with respect to the choice of baseline. Using ablation tests that gradually remove points with high or low attribution values, we show that pGS-CAM reliably identifies points with high and low contributions to the model output, whereas the performance of Integrated Gradients depends heavily on the baseline choice. Finally, we propose an averaging approach that aggregates the results for points receiving multiple saliency scores during segmentation, and we show that it produces saliency maps that reflect high-contribution input points better than previous approaches used with Integrated Gradients.
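As a minimal illustration of the averaging idea mentioned at the end of the abstract (not the paper's exact implementation), the sketch below averages the attributions of points that are scored multiple times, which happens when a point falls into several overlapping blocks or batches during segmentation. The function name and the NumPy-based layout are assumptions made for this example.

```python
import numpy as np

def average_point_saliency(point_ids, saliency_scores, num_points):
    """Aggregate multiple saliency scores per point by averaging.

    point_ids: 1-D integer array with the global index of each scored
               point; an index may appear several times when the point
               is processed in overlapping blocks during segmentation.
    saliency_scores: 1-D array of the same length, one attribution
               value per entry in point_ids.
    num_points: total number of points in the cloud.
    """
    sums = np.zeros(num_points)
    counts = np.zeros(num_points)
    # Accumulate the score sum and the visit count for every point.
    np.add.at(sums, point_ids, saliency_scores)
    np.add.at(counts, point_ids, 1)
    # Points that were never scored keep a saliency of 0;
    # clamp their counts to avoid division by zero.
    counts[counts == 0] = 1
    return sums / counts

# Usage: point 2 is scored twice (0.8 and 0.4) and averages to 0.6.
ids = np.array([0, 1, 2, 2])
scores = np.array([0.1, 0.5, 0.8, 0.4])
print(average_point_saliency(ids, scores, num_points=4))
```

Averaging, rather than keeping a single score per point as in previous approaches used with Integrated Gradients, smooths out block-dependent variation in the attributions.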