Self-Supervised Motion Segmentation with Confidence-Aware Loss Functions for Handling Occluded Pixels and Uncertain Optical Flow Predictions

Published: 01 Jan 2024 · Last Modified: 18 May 2025 · IROS 2024 · CC BY-SA 4.0
Abstract: In driving scenarios, motion segmentation is a fundamental component required by many downstream tasks. Recently, a self-supervised multi-task framework was proposed for driving scenarios that simultaneously trains motion segmentation, optical flow, depth, and ego-motion models without annotated data. The self-supervised architecture derives its training signal from the training data via loss functions; if these loss functions lack robustness, they can introduce model inaccuracies. To reduce the adverse effects of occlusion and optical flow estimation errors on motion segmentation, we propose two loss functions: (1) a Soft-Per-Pixel-Minimum (Soft-PPM) loss that excludes occluded pixels while temporally balancing each frame's contribution to the loss; and (2) a flow difference loss that excludes pixels with unclear motion states to diminish the effect of optical flow estimation errors. Our loss function design is based on the key insight that information such as depth and optical flow can both supervise the motion segmentation model and serve as a reliability measure for pixels during training. Our approach improves segmentation accuracy for both moving and static objects, achieving IoU scores on the moving and static classes comparable to state-of-the-art methods on the KITTI dataset.
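The abstract does not give the exact formulation of the Soft-PPM loss, but per-pixel-minimum photometric losses are common in self-supervised depth pipelines, where a hard minimum over source frames suppresses occluded pixels. The sketch below is a minimal, illustrative PyTorch reading in which that hard minimum is replaced by a temperature-controlled softmin, so low-error (likely unoccluded) frames dominate while every frame still contributes. The function name, tensor layout, and `temperature` parameter are assumptions for illustration, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def soft_ppm_loss(photo_errors: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Sketch of a soft per-pixel-minimum photometric loss (hypothetical form).

    photo_errors: (T, B, H, W) per-pixel photometric error of each of T
    source frames warped to the target frame. A softmin over the frame
    dimension down-weights occluded pixels (large error in one frame)
    while, unlike a hard minimum, keeping a balanced temporal
    contribution from all frames.
    """
    # Softmin weights: frames with lower error receive higher weight;
    # as temperature -> 0 this recovers the hard per-pixel minimum.
    weights = F.softmax(-photo_errors / temperature, dim=0)
    return (weights * photo_errors).sum(dim=0).mean()
```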
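For the flow difference loss, a common self-supervised motion cue is the discrepancy between the full optical flow and the rigid flow induced by depth and ego-motion: large disagreement suggests a moving pixel, small disagreement a static one, and in-between values are ambiguous. The sketch below assumes that reading and simply excludes the ambiguous band from the segmentation loss. All names, shapes, and thresholds are hypothetical placeholders, not the paper's specification.

```python
import torch
import torch.nn.functional as F

def flow_difference_loss(full_flow: torch.Tensor,
                         rigid_flow: torch.Tensor,
                         motion_logits: torch.Tensor,
                         lo_thresh: float = 0.5,
                         hi_thresh: float = 2.0) -> torch.Tensor:
    """Sketch of a flow-difference loss for motion segmentation (hypothetical form).

    full_flow, rigid_flow: (B, 2, H, W) optical flow and the rigid flow
    from depth + ego-motion. motion_logits: (B, 1, H, W) raw motion-
    segmentation scores. Pixels whose flow difference lies between the
    two thresholds have an unclear motion state and are masked out, so
    optical-flow estimation errors do not produce noisy supervision.
    """
    flow_diff = torch.norm(full_flow - rigid_flow, dim=1)   # (B, H, W)
    static = (flow_diff < lo_thresh).float()                # confidently static
    moving = (flow_diff > hi_thresh).float()                # confidently moving
    valid = static + moving                                 # excludes ambiguous pixels
    per_pixel = F.binary_cross_entropy_with_logits(
        motion_logits.squeeze(1), moving, reduction="none")
    # Average only over pixels with a clear motion state.
    return (per_pixel * valid).sum() / valid.sum().clamp(min=1.0)
```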