PCA-UNET for Object Segmentation

Cheng Long, Sayantika Nag, Adrian Barbu

Published: 01 Jan 2024, Last Modified: 17 Sept 2025, ICIP 2024, CC BY-SA 4.0

Abstract: This paper introduces a PCA-based shape model for representing object shapes and uses it as a decoder, together with a CNN, for instance and semantic segmentation. Unlike standard PCA-based shape models, which require point correspondences, the proposed model represents shapes as fixed-size binary images and performs PCA on the binary vectors of the training shapes. To obtain a well-trained shape model, standard data augmentation techniques are applied, and an online PCA approach is introduced to handle the resulting large number of training examples. The PCA shape model can be computed on a grid of object patches instead of on the entire object. Experiments on PascalVOC show that this grid-based PCA generalizes well, even to shapes of novel object types. The PCA shape model is then incorporated as a decoder on top of a ResNet CNN and trained end-to-end for instance and semantic segmentation. Experiments on two popular datasets show that the proposed model outperforms many existing state-of-the-art segmentation methods, such as Masked Autoencoder, Mask-RCNN, DeepLabV3 and U-Net, while being very simple and interpretable.
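To make the idea of PCA on binary shape vectors concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not say which online PCA algorithm is used, so this sketch assumes a simple streaming variant that accumulates the sum and the outer-product sum of the flattened masks batch by batch and eigendecomposes the resulting covariance at the end. The names `StreamingPCA` and `disk_mask`, the 16x16 mask size, the toy disk shapes and the choice of 20 components are all hypothetical and chosen only for illustration.

```python
import numpy as np


class StreamingPCA:
    """Illustrative streaming PCA: accumulate moments per batch, eigendecompose once."""

    def __init__(self, dim):
        self.n = 0
        self.s = np.zeros(dim)           # running sum of flattened masks
        self.ss = np.zeros((dim, dim))   # running sum of outer products

    def partial_fit(self, batch):
        # Flatten each binary mask into a vector and update the running moments.
        x = np.asarray(batch, dtype=np.float64).reshape(len(batch), -1)
        self.n += x.shape[0]
        self.s += x.sum(axis=0)
        self.ss += x.T @ x

    def finalize(self, k):
        # Covariance from accumulated moments, then the top-k eigenvectors.
        self.mean = self.s / self.n
        cov = self.ss / self.n - np.outer(self.mean, self.mean)
        vals, vecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
        order = np.argsort(vals)[::-1][:k]
        self.components = vecs[:, order].T       # shape (k, dim)
        return self

    def encode(self, masks):
        x = np.asarray(masks, dtype=np.float64).reshape(len(masks), -1)
        return (x - self.mean) @ self.components.T

    def decode(self, coeffs, shape):
        # Linear reconstruction, thresholded at 0.5 to return binary masks.
        x = coeffs @ self.components + self.mean
        return (x > 0.5).reshape(-1, *shape)


def disk_mask(size, cx, cy, r):
    """Toy binary shape: a filled disk on a size x size grid."""
    yy, xx = np.mgrid[:size, :size]
    return ((xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2).astype(np.uint8)


def random_disks(rng, size, count):
    return np.stack([
        disk_mask(size, rng.uniform(6, 10), rng.uniform(6, 10), rng.uniform(3, 6))
        for _ in range(count)
    ])


rng = np.random.default_rng(0)
size = 16
pca = StreamingPCA(size * size)
for _ in range(10):                      # ten batches, never held in memory together
    pca.partial_fit(random_disks(rng, size, 50))
pca.finalize(k=20)

test_masks = random_disks(rng, size, 20)
recon = pca.decode(pca.encode(test_masks), (size, size))
inter = np.logical_and(recon, test_masks).sum()
union = np.logical_or(recon, test_masks).sum()
print("IoU on held-out toy shapes:", inter / union)
```

In a segmentation network of the kind the abstract describes, the role of `decode` would be played by a fixed or trainable linear layer fed by CNN features. That wiring is omitted here, and a patch-grid variant would apply the same model to each patch rather than to the whole object mask.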