Abstract: Neural rendering has advanced significantly in 3D reconstruction and novel view synthesis, and integrating physics into these frameworks opens new applications such as physically accurate digital twins for robotics and XR.
However, the inverse problem of estimating physical parameters from visual observations remains challenging.
Existing physics-aware neural rendering methods typically require dense multi-view videos, making them impractical for scalable, real-world deployment.
Under sparse-view settings, the sequential optimization strategies employed by current approaches suffer from severe error accumulation: inaccuracies in initial 3D reconstruction propagate to subsequent stages, degrading physical state and material parameter estimates.
On the other hand, simultaneous optimization of all parameters fails due to the highly non-convex and often non-differentiable nature of the problem.
We propose ProJo4D, a progressive joint optimization framework that gradually expands the set of jointly optimized parameters. This design enables physics-informed gradients to refine geometry while avoiding the instability of direct joint optimization over all parameters.
Evaluations on synthetic and real-world datasets demonstrate that ProJo4D substantially outperforms prior work in 4D future state prediction and physical parameter estimation, achieving up to 10$\times$ improvement in geometric accuracy while maintaining computational efficiency.
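The progressive schedule described in the abstract can be illustrated with a minimal sketch. This is a toy illustration under assumed names, not the paper's implementation: the parameter groups (`geometry`, `state`, `material`), the surrogate loss, and the staging schedule are all invented for exposition. The key idea shown is that each stage unlocks one more group while continuing to update the previously unlocked ones jointly.

```python
import numpy as np

# Toy surrogate loss: each parameter group is pulled toward a per-group
# target. Targets and group names are hypothetical, chosen only to make
# the staging behavior visible.
TARGETS = {"geometry": 1.0, "state": -2.0, "material": 0.5}

def loss_and_grad(params):
    # Quadratic loss and its gradient per group.
    loss = sum((v - TARGETS[k]) ** 2 for k, v in params.items())
    grads = {k: 2.0 * (v - TARGETS[k]) for k, v in params.items()}
    return loss, grads

def progressive_joint_optimize(stages, lr=0.1, steps=200):
    # All groups start untrained; groups are unlocked stage by stage,
    # e.g. geometry -> geometry+state -> geometry+state+material.
    params = {k: 0.0 for k in TARGETS}
    active = []
    for group in stages:
        active.append(group)  # expand the jointly optimized set
        for _ in range(steps):
            _, grads = loss_and_grad(params)
            for k in active:  # joint gradient step over the expanded set
                params[k] -= lr * grads[k]
    return params

final = progressive_joint_optimize(["geometry", "state", "material"])
```

Earlier groups keep receiving gradients in later stages, which is what lets (in the paper's setting) physics-informed signals refine geometry, while never stepping on a group before its stage begins.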
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Chuan-Sheng_Foo1
Submission Number: 7182