Abstract: In recent years, volumetric videos have gradually prospered as an intriguing video paradigm, offering users a fully immersive viewing experience with six Degrees of Freedom (DoF). However, most current live volumetric video streaming methods struggle to meet real-time performance requirements due to frequent user interactions and complex network environments during video playback. Inspired by the correlation between human visual perception and the motion features of adjacent frames, we propose Cetus, a context-aware cross-layer coordination system for live volumetric videos. First, we present an application-layer Neural Radiance Fields (NeRF)-based codec framework that leverages spatio-temporal semantic information to optimize the compression quality of each video frame. Second, we design a flexible cross-layer coordination framework that seamlessly integrates a frame-drop strategy with partially reliable transmission, orchestrating transport protocols and application-informed rates to enhance the Quality of Experience (QoE) for multiple users. Furthermore, we develop a lightweight branching decision tree algorithm that adaptively makes fine-grained frame-drop decisions. Experimental evaluations of our implemented system prototype demonstrate that Cetus significantly outperforms existing baseline approaches. Compared to the state-of-the-art baselines, Cetus improves video frame rate by at least 24.7% and video quality by an average of 32.6%.
DOI: 10.1109/TON.2025.3638407