Abstract: Cloud gaming (also referred to as Game Streaming) is a rapidly emerging application that is changing the way people enjoy video games. However, if the user demands a high-resolution (e.g., 2K or 4K) stream, the game frames require high bandwidth, and the stream often suffers from a significant number of frame drops due to network congestion, degrading the Quality of Experience (QoE). Recently, DNN-based Super Resolution (SR) has gained prominence as a practical alternative: low-resolution frames are streamed and then upscaled at the client for enhanced video quality. However, performing such DNN-based tasks on resource-constrained, battery-operated mobile platforms is very expensive and fails to meet the real-time requirement of 60 frames per second (FPS). Unlike traditional video streaming, where frames can be downloaded, buffered, and then upscaled by their playback turn, Game Streaming is real-time and interactive: frames are generated on the fly, and the application cannot tolerate high latency or lag for frame upscaling. Thus, state-of-the-art (SOTA) DNN-based SR cannot satisfy the requirements of mobile Game Streaming. To this end, we propose GameStreamSR, a framework that enables real-time Super Resolution for Game Streaming applications on mobile platforms. We take the nature of visual perception into account and propose to apply DNN-based SR only to the regions with high visual importance, upscaling the remaining regions with traditional solutions such as bilinear interpolation. Specifically, we leverage the depth data from the game rendering pipeline to intelligently localize the important regions, called regions of importance (RoI), in the rendered game frames. Our evaluation of ten popular games on commodity mobile platforms shows that our proposal can enable real-time (60 FPS) neurally-augmented SR.
Our design achieves a $13 \times$ frame rate speedup (and $\approx 4 \times$ Motion-to-Photon latency improvement) for the reference frames and a $1.6 \times$ frame rate speedup for the non-reference frames, which translates to, on average, a $2 \times$ FPS improvement and 26-33% energy savings over SOTA DNN-based SR execution, while achieving about 2 dB PSNR gain and better perceptual quality than the current SOTA.
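
To make the core idea concrete, the following minimal Python sketch illustrates selective upscaling: the whole frame is upscaled with bilinear interpolation, and only one RoI is replaced by the output of a stronger upscaler. It is not the GameStreamSR implementation. Several parts are assumptions made for illustration: the function names (`bilinear_upscale`, `locate_roi`, `hybrid_upscale`), the single fixed-size rectangular RoI, the heuristic that the window with the smallest mean depth (the nearest content) is the most important, and the `sr_model` callable, which here is a nearest-neighbour placeholder standing in for a real DNN.

```python
import numpy as np

def bilinear_upscale(img, scale):
    """Upscale an (H, W, C) float array by an integer factor (bilinear)."""
    h, w = img.shape[:2]
    # Map each output pixel centre back to source coordinates, clamped to the image.
    ys = np.clip((np.arange(h * scale) + 0.5) / scale - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * scale) + 0.5) / scale - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def locate_roi(depth, roi_h, roi_w):
    """Return (top, left) of the roi_h x roi_w window with the smallest mean depth.

    ASSUMPTION: nearer content is treated as more important; the source only
    states that depth data is used to localize the RoI.
    """
    # Summed-area table with a zero border, so each window sum is four lookups.
    sat = np.pad(depth.astype(np.float64), ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    sums = (sat[roi_h:, roi_w:] - sat[:-roi_h, roi_w:]
            - sat[roi_h:, :-roi_w] + sat[:-roi_h, :-roi_w])
    top, left = np.unravel_index(np.argmin(sums), sums.shape)
    return int(top), int(left)

def hybrid_upscale(frame, depth, scale, roi_h, roi_w, sr_model):
    """Bilinear upscaling everywhere, with sr_model applied to the RoI only."""
    out = bilinear_upscale(frame, scale)
    top, left = locate_roi(depth, roi_h, roi_w)
    patch = frame[top:top + roi_h, left:left + roi_w]
    out[top * scale:(top + roi_h) * scale,
        left * scale:(left + roi_w) * scale] = sr_model(patch, scale)
    return out, (top, left)

def placeholder_sr(patch, scale):
    """Hypothetical stand-in for a DNN SR model (nearest-neighbour repeat)."""
    return np.repeat(np.repeat(patch, scale, axis=0), scale, axis=1)

# Toy input: an 8x8 RGB frame whose depth map has a near object at rows 2-5, cols 3-6.
rng = np.random.default_rng(0)
frame = rng.random((8, 8, 3))
depth = np.ones((8, 8))
depth[2:6, 3:7] = 0.1
upscaled, roi = hybrid_upscale(frame, depth, scale=2, roi_h=4, roi_w=4,
                               sr_model=placeholder_sr)
print(upscaled.shape, roi)
```

In this sketch the expensive upscaler processes only the RoI pixels, so its cost scales with the RoI area rather than the full frame; how the RoI size and position are actually chosen in GameStreamSR is not specified in the abstract.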