LSE-CVCNet: A Generalized Stereoscopic Matching Network Based on Local Structural Entropy and Multi-Scale Fusion
Abstract: This study presents LSE-CVCNet, a novel stereo matching network designed for dynamic scenes, where texture variability causes feature misalignment and occlusions introduce contextual ambiguity. The network integrates three components: (1) local structural entropy (LSE), which quantifies structural uncertainty in disparity maps and guides adaptive attention; (2) a cross-image attention mechanism (CIAM-T), which asymmetrically extracts features from the left and right images to improve feature alignment; and (3) multi-resolution cost volume fusion (MRCV-F), which preserves fine-grained details through multi-scale fusion. Together, these components improve disparity estimation accuracy and cross-domain generalization. Experimental results demonstrate robustness under varying lighting, occlusions, and complex geometries, with LSE-CVCNet outperforming state-of-the-art methods across multiple datasets. Ablation studies validate each module's contribution, and cross-domain tests confirm generalization to unseen scenarios. This work establishes a new paradigm for adaptive stereo matching in dynamic environments.
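The abstract describes LSE as a measure of structural uncertainty in a disparity map but does not give its formula. As a rough illustration only, the sketch below assumes one plausible reading: the Shannon entropy of quantized disparity values inside a square window around each pixel. The function name `local_entropy` and the parameters `window` and `bins` are hypothetical and are not taken from the paper.

```python
import numpy as np

def local_entropy(disparity, window=5, bins=8):
    """Per-pixel Shannon entropy (in bits) of quantized disparities in a window.

    Assumption: this is a generic local-entropy measure, not necessarily
    the LSE definition used by LSE-CVCNet. `window` must be odd.
    """
    d = np.asarray(disparity, dtype=np.float64)
    lo, hi = d.min(), d.max()
    if hi == lo:
        # A constant map has no structural uncertainty.
        return np.zeros_like(d)

    # Quantize disparities into `bins` integer levels in [0, bins - 1].
    q = np.minimum(((d - lo) / (hi - lo) * bins).astype(np.int64), bins - 1)

    # Pad by replicating edges so every pixel has a full window.
    pad = window // 2
    qp = np.pad(q, pad, mode="edge")

    # View of shape (H, W, window, window): one window per pixel.
    win = np.lib.stride_tricks.sliding_window_view(qp, (window, window))

    ent = np.zeros(d.shape)
    n = window * window
    for b in range(bins):
        # Fraction of the window that falls into level b.
        p = (win == b).sum(axis=(-1, -2)) / n
        nz = p > 0
        ent[nz] -= p[nz] * np.log2(p[nz])
    return ent
```

Under this assumption, entropy is zero in regions of uniform disparity and rises near depth discontinuities, which is the kind of signal that could be used to direct attention toward uncertain areas.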
External IDs: dblp:journals/entropy/YangZGHLZ25