SimULi: Real-Time LiDAR and Camera Simulation with Unscented Transforms

Paper ID 3525

Capabilities

After training, SimULi can render novel views with arbitrary camera models, including fisheye lenses, as shown in our interactive viewer.
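Rendering with arbitrary camera models requires propagating 3D Gaussians through nonlinear lens projections, which is what the unscented transform in our title refers to. The sketch below is illustrative only (the equidistant fisheye model, focal length, and default sigma-point parameters are our assumptions, not SimULi's exact formulation): it projects sigma points of a 3D Gaussian through a fisheye model and recovers the projected 2D mean and covariance.

```python
import numpy as np

def fisheye_equidistant(points, f=500.0):
    """Equidistant fisheye projection r = f * theta (assumed lens model)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.arctan2(np.sqrt(x**2 + y**2), z)  # angle from the optical axis
    phi = np.arctan2(y, x)                       # azimuth in the image plane
    r = f * theta
    return np.stack([r * np.cos(phi), r * np.sin(phi)], axis=-1)

def unscented_transform(mean, cov, project, alpha=1.0, beta=2.0, kappa=0.0):
    """Propagate a Gaussian through a nonlinear map via 2n+1 sigma points."""
    n = mean.shape[0]
    lam = alpha**2 * (n + kappa) - n
    # Sigma points: the mean plus/minus scaled square-root-covariance columns.
    L = np.linalg.cholesky((n + lam) * cov)
    pts = np.concatenate([mean[None], mean[None] + L.T, mean[None] - L.T])
    w_m = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    w_c = w_m.copy()
    w_m[0] = lam / (n + lam)
    w_c[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    proj = project(pts)
    mu = w_m @ proj                      # projected mean
    d = proj - mu
    sigma = (w_c[:, None] * d).T @ d     # projected covariance
    return mu, sigma

# A 3D Gaussian 4 m in front of the camera, slightly off-axis.
mean3d = np.array([0.5, -0.3, 4.0])
cov3d = np.diag([0.05, 0.05, 0.2])
mu2d, cov2d = unscented_transform(mean3d, cov3d, fisheye_equidistant)
```

Unlike a first-order (Jacobian) linearization, the sigma points sample the projection function directly, so the same code works for any camera model without deriving per-model Jacobians.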
SimULi accurately models time-dependent effects, such as rolling shutter, as shown below:
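The essence of rolling-shutter modeling is that each image row is captured at a slightly different time, so each row sees the sensor at a slightly different pose. As a minimal sketch (translation-only linear interpolation; a full treatment would also interpolate rotation, e.g. via SLERP, and this is not SimULi's exact implementation):

```python
import numpy as np

def row_timestamps(t_start, t_end, num_rows):
    """Per-row capture times for a top-to-bottom rolling-shutter readout."""
    return np.linspace(t_start, t_end, num_rows)

def interpolate_position(t, t0, p0, t1, p1):
    """Linearly interpolate the sensor position between two timed poses."""
    w = (t - t0) / (t1 - t0)
    return (1 - w) * p0 + w * p1

# A sensor moving 10 m/s forward while a 1080-row image reads out over 30 ms:
ts = row_timestamps(0.0, 0.03, 1080)
p0, p1 = np.zeros(3), np.array([0.3, 0.0, 0.0])  # 0.3 m traveled during readout
row_origins = np.stack([interpolate_position(t, 0.0, p0, 0.03, p1) for t in ts])
```

Each row's rays are then cast from its own interpolated origin; ignoring this, as global-shutter methods do, smears fast-moving content, which is most visible at the image periphery.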
Because our LiDAR tiling scheme automatically finds optimal tiling parameters for arbitrary LiDAR models, as described in Sec. B of the appendix, we can easily render fully customized sensors.
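SimULi's actual tiling search is described in Sec. B of the appendix; as an illustrative stand-in only (the function, the points-per-tile budget, and the square-ish-tile heuristic are our assumptions), the sketch below picks a tile grid over a spinning LiDAR's range image so that each tile holds roughly a fixed number of returns:

```python
import math

def choose_tiling(num_beams, points_per_rev, target_points_per_tile=4096):
    """Pick a (rows, cols) tile grid over a LiDAR range image so each tile
    holds roughly target_points_per_tile returns (a simple heuristic)."""
    total = num_beams * points_per_rev
    num_tiles = max(1, math.ceil(total / target_points_per_tile))
    # Favor square-ish tiles in (elevation, azimuth) index space:
    # beams/rows ~= steps/cols  =>  rows ~= sqrt(num_tiles * beams / steps).
    rows = max(1, round(math.sqrt(num_tiles * num_beams / points_per_rev)))
    rows = min(rows, num_beams)
    cols = math.ceil(num_tiles / rows)
    return rows, cols

# Example: a 64-beam spinning LiDAR with 1800 azimuth steps per revolution.
rows, cols = choose_tiling(64, 1800)
```

Because the grid is derived from the sensor's beam count and azimuth resolution rather than hard-coded, swapping in a different LiDAR model only changes the inputs.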

Representation

We compare our factorized representation and anchoring loss to unified alternatives supervised directly with LiDAR depth or solely with camera losses. We render novel views on PandaSet below. Our approach noticeably outperforms these alternatives (and renders LiDAR 2x faster).

Waymo Interp Comparisons

We compare SimULi to prior work (FPS measured on an NVIDIA A100 GPU) on clips from the Waymo Interp benchmark. Our method renders 5-10x faster than the NeRF-based UniSim and NeuRAD baselines, and at higher quality. Compared to 3DGS, we render at a similar speed (faster in cases where 3DGS optimization creates more particles) and improve quality, especially at the periphery, where rolling shutter is a factor.
Here, SimULi renders the lettering of the street sign where other methods cannot - a capability that is critical for autonomous driving.

Waymo Dynamic Comparisons

We also compare our method to OmniRe, a dynamic 3DGS-based method that handles neither rolling shutter nor LiDAR rasterization, on clips from the Waymo Dynamic benchmark. We render almost 4x faster and at higher quality.
As OmniRe cannot rasterize LiDAR range maps, it instead relies on projected depth maps for supervision. Although Waymo LiDAR is closely synchronized with the cameras (which is not the case for all autonomous driving rigs), we still see projection errors that contribute to the halo around the gray vehicle on the right.

PandaSet Comparisons

Here, we further compare SimULi to SplatAD, a recent method that adds LiDAR rasterization and rolling shutter to 3DGS. On PandaSet, on which their method is chiefly evaluated, we render 35% faster (and >10x faster than NeRF-based methods) at similar-to-better quality. In this nighttime capture, our learned bilateral grids handle lens flare slightly better than SplatAD's appearance embeddings (and avoid the "fog" artifacts rendered by NeRF methods).
Our factorized representation and rendering innovations allow us to render LiDAR at higher fidelity over 10x faster.