ALStereo: Active learning for stereo matching

Published: 01 Jan 2025, Last Modified: 22 Jul 2025, Pattern Recognit. 2025, CC BY-SA 4.0
Abstract: With advances in deep stereo matching, recent networks have achieved impressive accuracy in estimating depth from image pairs. However, stereo matching networks require sufficient disparity labels, which are costly to annotate. In this paper, we propose the ALStereo framework for training stereo matching networks under limited labeling budgets. It selects informative samples for manual labeling and conducts semi-supervised learning to propagate their knowledge to unlabeled samples. Specifically, we embed image pairs as nodes in a graph representation, where edges denote similarity in terms of stereo matching challenges. Based on this graph, we divide the labeling budget into two parts, one for a representativeness-based strategy and one for an uncertainty-based strategy, balancing the selection of the most representative and the most challenging samples. To fully exploit the labeled samples, we propose a two-stage semi-supervised training pipeline, in which the first stage mitigates domain shift and the second stage propagates the knowledge of manually annotated samples to unlabeled samples. We establish the first benchmark for training stereo matching networks under limited labeling budgets and demonstrate that our method significantly outperforms the compared methods. We also provide an analysis demonstrating that our graph representation effectively models the similarity between samples in terms of stereo matching challenges.
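The budget split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how edge similarities or uncertainties are computed, nor which selection rules are used. The sketch assumes a precomputed similarity matrix and per-sample uncertainty scores, uses a greedy coverage heuristic for the representativeness part, and takes the top-scoring remaining samples for the uncertainty part. The names `select_samples`, `rep_fraction`, `similarity`, and `uncertainty` are hypothetical.

```python
import numpy as np


def select_samples(similarity, uncertainty, budget, rep_fraction=0.5):
    """Pick `budget` sample indices to label (illustrative sketch only).

    similarity   : (n, n) array, similarity[i, j] = assumed edge weight
                   between image pairs i and j (how it is computed is
                   not specified in the abstract).
    uncertainty  : (n,) array, assumed per-sample uncertainty score.
    rep_fraction : assumed share of the budget given to the
                   representativeness-based strategy.
    """
    n = similarity.shape[0]
    if budget > n:
        raise ValueError("budget exceeds the number of samples")
    n_rep = int(round(budget * rep_fraction))
    n_unc = budget - n_rep

    selected = []
    chosen = np.zeros(n, dtype=bool)
    # coverage[i]: best similarity between node i and any selected node
    coverage = np.zeros(n)
    for _ in range(n_rep):
        # gain[j]: total coverage improvement if node j were selected
        gain = np.maximum(similarity - coverage[:, None], 0.0).sum(axis=0)
        gain[chosen] = -np.inf
        j = int(np.argmax(gain))
        selected.append(j)
        chosen[j] = True
        coverage = np.maximum(coverage, similarity[:, j])

    # Spend the rest of the budget on the most uncertain unselected samples.
    remaining = np.flatnonzero(~chosen)
    order = remaining[np.argsort(-uncertainty[remaining], kind="stable")]
    selected.extend(int(i) for i in order[:n_unc])
    return selected
```

With `rep_fraction=0.5`, half of the budget goes to samples that best cover the graph and the other half to the samples scored as most uncertain. How the budget should actually be divided is a design choice that this sketch leaves open.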