Abstract: Bokeh is a wide-aperture optical effect that creates aesthetic blurring in photography. However, achieving this effect typically demands expensive professional equipment and expertise. To make such cinematic techniques more accessible, bokeh rendering aims to generate the desired bokeh effects from all-in-focus inputs captured by smartphones. Previous efforts in bokeh rendering primarily focus on static images. When extended to video inputs, however, these methods exhibit flicker and artifacts due to a lack of temporal consistency modeling. Nor can they exploit information from adjacent frames, such as occluded objects, which is necessary for bokeh rendering. Moreover, the difficulty of capturing paired all-in-focus and bokeh videos results in a shortage of data for training video bokeh models. To tackle these challenges, we propose the Video Bokeh Renderer (VBR), a model designed specifically for video bokeh rendering. VBR leverages implicit feature-space alignment and aggregation to model temporal consistency and exploit complementary information from adjacent frames. On the data front, we introduce the first Synthetic Video Bokeh (SVB) dataset, which synthesizes realistic bokeh effects using ray tracing. Furthermore, to improve the model's robustness to inaccurate disparity maps, we employ a set of augmentation strategies that simulate corrupted disparity inputs during training. Experimental results on both synthetic and real-world data demonstrate the effectiveness of our method. Code and dataset will be released.
Primary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: Bokeh, a term denoting the aesthetic blur in out-of-focus areas, is widely utilized across various visual media, such as movie production, animation, and game rendering. However, achieving this effect typically requires expensive DSLR cameras and expertise. Although many efforts have been made in single-image bokeh rendering, video bokeh rendering remains underexplored, despite the significant role of video in daily life. Our work aims to make bokeh effects more accessible to casual videographers by enabling the rendering of refocusable videos from initially all-in-focus footage. We introduce a novel model, the Video Bokeh Renderer (VBR), for generating aesthetic bokeh effects from all-in-focus videos captured by smartphones. Notably, our model excels at maintaining temporal consistency throughout the video sequence and effectively mitigates artifacts at object edges by harnessing information from neighboring frames. Additionally, we propose the first Synthetic Video Bokeh (SVB) dataset, addressing the data shortage in this field and enabling further research and development. Overall, this work is the first to consider temporal information in video bokeh rendering, establishing a solid foundation for future advancements in multimedia processing applications.
Supplementary Material: zip
Submission Number: 1299