Abstract: This paper studies the task of SatStreet-view synthesis, which aims to render photorealistic street-view panorama images and videos given a satellite image and specified camera positions or trajectories. Our approach learns a satellite-image-conditioned neural radiance field from paired images captured from satellite and street viewpoints, which is a challenging learning problem due to the sparse-view nature of the data and the extremely large viewpoint changes between satellite and street-view images. We tackle these challenges based on a task-specific observation that street-view-specific elements, including the sky and illumination effects, are visible only in street-view panoramas, and present a novel approach, Sat2Density++, that achieves photorealistic street-view panorama rendering by modeling these street-view-specific elements with neural networks. In experiments on both urban and suburban scene datasets, we demonstrate that Sat2Density++ renders photorealistic street-view panoramas that are consistent across multiple views and faithful to the satellite image. The project page is available at https://qianmingduowan.github.io/sat2density-pp/.
DOI: 10.1109/tpami.2026.3652860