Decoupling and Aggregating: Dual-Layer Light Field Depth Estimation With Reflective and Transparent Surfaces

Shuo Zhang, Yanlin Xie, Jiaxin Chen, Youfang Lin

Published: 2026, Last Modified: 27 Apr 2026, IEEE Trans. Circuits Syst. Video Technol. 2026, CC BY-SA 4.0

Abstract: Light field (LF) images are widely used for depth estimation because of their rich structural information. However, real-world LF images often contain reflective and transparent surfaces. The corresponding regions carry depth information from both the reflection layer and the background layer, and can be modeled as dual-layer scenes. In existing depth estimation frameworks, the cost volume constructed over dual-layer surfaces shows an aliased bimodal distribution, which leads to severely incorrect depth estimates. In this paper, we propose a novel decoupling-and-aggregating strategy and develop a dual-layer depth estimation network for LF images with complex reflections. Specifically, we develop an adaptive cost volume decoupling module that separates the background and reflection features from the aliased cost volume. LF angular-spatial information is extracted to infer how much the features in each dimension contribute to the background or reflection layer. Additionally, we employ an iterative self-guided aggregating module with multi-stage supervision to aggregate the two branches of cost volumes. The module applies self-guided masks to regularize the distribution of the cost volumes. Because ground-truth disparity maps are difficult to acquire for LF images of reflective scenes, we also construct a synthetic dataset with dual-layer properties. Our model is the first to introduce dual-layer scenes into the LF depth estimation task using an end-to-end deep neural network. It successfully separates the background and reflection layers and achieves accurate depth estimation in both layers. Quantitative and qualitative experimental results on publicly available datasets demonstrate that our method outperforms other state-of-the-art methods.
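To make the bimodal cost volume problem concrete, the following is a minimal toy sketch and not the paper's network. It assumes a hypothetical Gaussian-shaped matching cost (`toy_cost`) and a soft-argmin disparity regression, which is common in cost-volume pipelines but is not confirmed by the abstract as the regression used by the frameworks it discusses. All function names, the disparity range, and the layer disparities are illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_argmin_disparity(costs, disparities):
    """Expected disparity under softmax(-cost) (assumed regression step)."""
    probs = softmax([-c for c in costs])
    return sum(p * d for p, d in zip(probs, disparities))

def toy_cost(disparities, layer_disparities, width=0.5, scale=10.0):
    """Hypothetical matching cost: one dip per layer disparity."""
    costs = []
    for d in disparities:
        # Each layer contributes a Gaussian-shaped match score around its disparity.
        match = sum(
            math.exp(-((d - ld) ** 2) / (2 * width ** 2))
            for ld in layer_disparities
        )
        costs.append(-scale * match)  # lower cost means a better match
    return costs

# Candidate disparities from -4 to 4 in steps of 0.25 (assumed range).
disparities = [-4 + 0.25 * i for i in range(33)]

# Single-layer pixel: one surface at disparity -2.
single = toy_cost(disparities, [-2.0])
# Dual-layer pixel: background at -2 and reflection at +2 (assumed values).
dual = toy_cost(disparities, [-2.0, 2.0])

d_single = soft_argmin_disparity(single, disparities)
d_dual = soft_argmin_disparity(dual, disparities)

print("single-layer estimate:", round(d_single, 3))
print("dual-layer estimate:", round(d_dual, 3))
```

In this toy setting the single-layer estimate lands close to the true disparity, while the dual-layer estimate falls between the two modes and matches neither layer. This is one way a bimodal cost distribution can produce wrong depth, and it illustrates why separating the two layers before regression is attractive.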