MonoPCFlow: Enabling Efficient Scene Flow Estimation From Monocular View

Chichao Cheng, Guangming Wang, Yin-Dong Zheng, Lu Liu, Hesheng Wang

Published: 01 Jan 2025, Last Modified: 17 Apr 2026. Venue: IEEE Transactions on Instrumentation and Measurement. License: CC BY-SA 4.0

Abstract: Scene flow captures the dynamic changes of points in a 3-D scene, essential for understanding motion in physical environments. Light detection and ranging (LiDAR)-based scene flow estimation methods face challenges related to resolution, refresh rate, and cost. In contrast, monocular image-based methods estimate optical flow and depth separately at different stages. This fragmented approach inevitably compromises spatial–temporal consistency and introduces error accumulation. We propose monocular point cloud FlowNet (MonoPCFlow), a novel framework for scene flow estimation directly from a pair of consecutive monocular images. We integrate pseudo-LiDAR representations with dense 3-D scene flow estimation, effectively bridging the 2-D-to-3-D domain gap for monocular motion analysis. We develop a depth-enhanced refinement module that mitigates information loss in pseudo-LiDAR generation, preserving critical geometric and appearance features to improve scene flow accuracy. Experiments demonstrate MonoPCFlow’s superior performance, with relative reductions in endpoint error of 37.0% on FlyingThings3D and 39.7% on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset compared to contemporary benchmarks.
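As a rough illustration of two ideas the abstract relies on, the sketch below lifts a dense depth map to a pseudo-LiDAR point cloud and computes endpoint error together with its relative reduction. It is not the authors' implementation. It assumes a standard pinhole camera model, and the names `backproject`, `endpoint_error`, `relative_reduction` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical, introduced only for this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject(depth: Sequence[Sequence[float]],
                fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Lift a depth map to 3-D points (assumed pinhole model, camera frame)."""
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row
        for u, z in enumerate(row):        # u: pixel column, z: depth
            if z <= 0:                     # skip pixels without valid depth
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def endpoint_error(pred: Sequence[Point], gt: Sequence[Point]) -> float:
    """Mean Euclidean distance between predicted and ground-truth flow vectors."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("flows must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def relative_reduction(baseline_epe: float, new_epe: float) -> float:
    """Fraction by which new_epe improves on baseline_epe."""
    return (baseline_epe - new_epe) / baseline_epe
```

Under this definition, a reported 37.0% relative reduction corresponds to `relative_reduction(baseline, ours)` returning 0.370, whatever the absolute error values are.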

External IDs: doi:10.1109/tim.2025.3600732