Keywords: Smooth Zoom, Frame Interpolation, 3D Gaussian Splatting, Velocity Disambiguity, Cross-Device Dataset
TL;DR: We propose OmniZoom, a universal plug-and-play framework for cross-device smooth zoom interpolation.
Abstract: Dual-camera smartphones suffer from geometric and photometric inconsistencies during zoom transitions, primarily due to disparities in intrinsic/extrinsic parameters and divergent image processing pipelines between the two cameras. Existing interpolation methods struggle to effectively address this issue, constrained by the lack of ground-truth datasets and motion ambiguity in dynamic scenarios.
To overcome these challenges, we propose OmniZoom, a universal plug-and-play framework for cross-device smooth zoom interpolation.
Specifically, we present a novel cross-device virtual data generation method based on 3D Gaussian Splatting. It tackles data scarcity by decoupling geometric features via spatial transition modeling and correcting photometric variations with dynamic color adaptation, and is further enhanced by cross-domain consistency learning for device-agnostic semantic alignment. In addition, we introduce a plug-and-play 3D-TPR (3D Trajectory Progress Ratio Mapping) framework that overcomes the limitations of purely 2D spatial mappings. Within this framework, a texture-focus strategy preserves high-frequency details, with mask penalty constraints to suppress interpolation artifacts. Our pipeline is broadly compatible with diverse interpolation methods and achieves strong performance across multiple public benchmarks. Real-world evaluations on various smartphone platforms also show significant quality improvements after fine-tuning on our synthetic data, underscoring the robustness and practical effectiveness of our approach for cross-device zoom applications.
Supplementary Material: zip
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 2027