Abstract: Consumer-level omnidirectional video offers an economically viable means of creating virtual reality (VR) assets, enabling users to explore and interact with a fully immersive visual environment. However, editing such videos, particularly those with 360${}^{\circ }$ views and dynamic objects, poses significant challenges. Existing approaches to representing and manipulating omnidirectional content—whether designed for typical 2D perspective imagery or panoramas—often fail to capture the complex spatiotemporal relationships crucial for producing high-quality, editable outputs in dynamic, panoramic settings. To overcome these challenges, we introduce OmniPlane, a novel method that leverages spherical spatiotemporal feature grids to represent and edit real-world dynamic omnidirectional environments casually captured by commodity omnidirectional cameras. OmniPlane computes spatiotemporal features by fusing vectors or matrices sampled from each learnable spatial and spatiotemporal feature plane within a spherical coordinate system, complemented by a weighted sampling strategy specifically designed to respect the inherent spherical distribution of omnidirectional content. The learned feature planes can be flexibly decomposed into palette-based color bases, so the method not only enhances the representation of omnidirectional content and dynamics but also enables recoloring of omnidirectional videos. Extensive experiments and a dedicated user study validate the superior performance of our method in producing recolorable representations of dynamic omnidirectional environments.
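To make the plane-factorization idea concrete, the sketch below shows a toy feature field over spherical coordinates $(\theta, \phi, r)$ and time $t$: one small 2D feature plane per coordinate pair, with a query fused by an elementwise product of bilinearly interpolated per-plane features. This is only an illustration of the general K-Planes-style factorization the abstract alludes to; the class and function names (`SphericalPlaneField`, `bilinear_sample`), the grid resolution, and the product fusion are assumptions, not the authors' implementation, and OmniPlane's weighted sampling and palette decomposition are omitted.

```python
import numpy as np

def bilinear_sample(plane, u, v):
    """Bilinearly interpolate a (H, W, F) feature plane at normalized (u, v) in [0, 1]."""
    H, W, _ = plane.shape
    x, y = u * (W - 1), v * (H - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * plane[y0, x0]
            + wx * (1 - wy) * plane[y0, x1]
            + (1 - wx) * wy * plane[y1, x0]
            + wx * wy * plane[y1, x1])

class SphericalPlaneField:
    """Toy factored feature field over spherical coords (theta, phi, r) and time t.

    One learnable 2D plane per coordinate pair; a query fuses the six
    per-plane feature vectors by elementwise product (hypothetical sketch).
    """
    PAIRS = [("theta", "phi"), ("theta", "r"), ("phi", "r"),
             ("theta", "t"), ("phi", "t"), ("r", "t")]

    def __init__(self, res=16, feat_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = {p: rng.standard_normal((res, res, feat_dim)) * 0.1
                       for p in self.PAIRS}

    def query(self, theta, phi, r, t):
        # Normalize each coordinate to [0, 1] before sampling its planes.
        coords = {"theta": theta / np.pi,       # polar angle in [0, pi]
                  "phi": phi / (2 * np.pi),     # azimuth in [0, 2*pi)
                  "r": r,                       # radius, assumed pre-normalized
                  "t": t}                       # time, assumed pre-normalized
        feat = np.ones(next(iter(self.planes.values())).shape[-1])
        for (a, b), plane in self.planes.items():
            feat *= bilinear_sample(plane, coords[a], coords[b])
        return feat
```

In a full system the fused feature would be decoded (e.g., by a small MLP) into color and density; here the query simply returns the fused feature vector.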
External IDs: dblp:journals/tvcg/KouZNKD25