Keywords: Point Cloud Forecasting, Flow Matching, Diffusion
Abstract: Predicting the evolution of complex 3D scenes is fundamental to safe autonomous navigation. While LiDAR provides the necessary geometric precision, existing forecasting methods often struggle with the inherent multimodality of dynamic environments. Deterministic models typically collapse to the conditional mean, resulting in "ghosting" artifacts and non-physical spatial interpolations that fail to represent discrete future contingencies.
In this paper, we introduce Li-AutoFlow, a generative framework that unifies Autoregressive Sequence Modeling with Flow Matching to produce high-fidelity, multimodal LiDAR forecasts. By operating directly on the continuous 2D range image manifold, our approach bypasses the limitations of discrete tokenization and lossy codebook bottlenecks found in prior generative works. Our formulation leverages straight-line probability flow ODEs to transport a noise prior into physically plausible future scenes, ensuring temporal causality through an autoregressive conditioning scheme.
Crucially, our tokenizer-free architecture maintains a fully differentiable pipeline from the flow-matching objective to the final reconstructed 3D point cloud. This enables direct optimization with respect to 3D geometric metrics, such as the Chamfer distance, ensuring centimeter-level precision. Experimental results on KITTI demonstrate that Li-AutoFlow significantly outperforms state-of-the-art deterministic and discrete-diffusion baselines, providing the coherent, distinct scene samples necessary for robust motion planning in uncertain environments.
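The two ingredients the abstract names, a straight-line flow-matching interpolant and a differentiable Chamfer distance on the reconstructed point cloud, can be sketched as follows. This is a minimal NumPy illustration under assumed conventions (linear interpolant with velocity target, symmetric Chamfer), not the authors' implementation; all function names are hypothetical.

```python
import numpy as np

def flow_matching_pair(x0, x1, t):
    """Straight-line flow-matching interpolant:
    x_t = (1 - t) * x0 + t * x1, with target velocity v = x1 - x0
    (the constant-velocity field a straight-line probability flow ODE follows)."""
    xt = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return xt, v_target

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a: (N, 3) and b: (M, 3):
    mean nearest-neighbor distance in both directions."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Toy example: transport a noise sample toward a target point cloud.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(64, 3))         # noise prior sample
x1 = rng.normal(size=(64, 3)) + 5.0   # "future scene" point cloud
xt, v = flow_matching_pair(x0, x1, t=0.5)
```

In training, a network would regress `v_target` from `(xt, t)` plus the autoregressive conditioning; at inference, integrating the learned velocity field from noise yields a sample, and the differentiable Chamfer term can supervise the reconstructed geometry directly.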
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 11