Mixture-of-Gaussian Evidential Learning for Uncertainty-Aware Stereo Matching

Weide Liu; MINGRUI LI; wang xingxing; Lu Wang; Jun Cheng; Fayao Liu; Liang Cao; Xulei Yang

04 Sept 2025 (modified: 12 Feb 2026) | ICLR 2026 Conference Desk Rejected Submission | CC BY 4.0

Keywords: Stereo Matching

TL;DR: We propose an evidential Gaussian-mixture framework for stereo matching that better captures multimodal depth uncertainties, achieving state-of-the-art performance across in-domain and cross-domain benchmarks.

Abstract: Stereo matching remains a challenging task due to the presence of uncertainties in real-world data, particularly in textureless, occluded, or reflective regions. While existing methods incorporate uncertainty estimation into stereo matching for better performance, they typically assume a pixel-wise unimodal Gaussian distribution. However, the depth distributions in real-world scenarios are rarely unimodal, making the single-Gaussian assumption inadequate for modeling their heteroscedastic and multimodal characteristics. We address this limitation with a new evidential learning framework that models each pixel with a Gaussian mixture distribution. Each mixture component is regularized by an inverse-Gamma prior, and the network predicts pseudo-posterior mixture probabilities, enabling principled per-component uncertainty estimation. We evaluate our method on stereo matching by training on the Scene Flow dataset and testing on KITTI 2015 and Middlebury 2014. Experimental results consistently show that our approach outperforms baseline methods and achieves new state-of-the-art performance on both in-domain and cross-domain benchmarks, demonstrating the robustness and effectiveness of the proposed framework. The code will be publicly released upon completion of the review process.
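To make the multimodality argument concrete, the following is a minimal sketch of a generic per-pixel Gaussian mixture likelihood, not the authors' implementation. The function names, the two-component setup, and the disparity values are all hypothetical, and the inverse-Gamma prior and the network that predicts the mixture probabilities are not modeled here. It compares the negative log-likelihood of a two-component mixture with that of a single Gaussian matched to the same mean and variance, for a pixel whose disparity could plausibly belong to either a background or a foreground surface.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_nll(x, weights, means, variances):
    """Negative log-likelihood of x under a Gaussian mixture.

    `weights` are assumed to be non-negative and to sum to 1.
    """
    likelihood = sum(
        w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    )
    return -math.log(likelihood)


def mixture_moments(weights, means, variances):
    """Overall mean and variance of a Gaussian mixture (law of total variance)."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (v + m ** 2) for w, m, v in zip(weights, means, variances))
    return mean, second_moment - mean ** 2


# Hypothetical pixel on a depth edge: the disparity is either about 10
# (background) or about 40 (foreground), with equal probability.
weights = [0.5, 0.5]
means = [10.0, 40.0]
variances = [1.0, 1.0]

# A single Gaussian with the same mean and variance as the mixture.
mix_mean, mix_var = mixture_moments(weights, means, variances)

# Suppose the true disparity is 40 (the foreground surface).
true_disparity = 40.0
nll_mixture = mixture_nll(true_disparity, weights, means, variances)
nll_single = mixture_nll(true_disparity, [1.0], [mix_mean], [mix_var])

print(f"mixture NLL: {nll_mixture:.3f}, single-Gaussian NLL: {nll_single:.3f}")
```

In this toy case the moment-matched single Gaussian centers on a disparity between the two surfaces and spreads its mass widely, so it assigns the true disparity a lower likelihood than the mixture does. This is the kind of failure the abstract attributes to the unimodal assumption.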

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 1901
