3D motion in visual saliency modeling

Pengfei Wan, Yunlong Feng, Gene Cheung, Ivan V. Bajic, Oscar C. Au, Yusheng Ji

Published: 2013, Last Modified: 17 May 2023, ICASSP 2013, Readers: Everyone

Abstract: Visual saliency is a probabilistic estimate of how likely a given spatial area in an image or video is to attract human visual attention relative to other areas. Bottom-up saliency models aggregate low-level image features, such as luminance and color contrast, flicker, and 2D motion, to construct a plausible saliency map. In this paper, we introduce 3D motion (object movement towards or away from the observer) into bottom-up video saliency modeling. Given the availability of per-pixel depth maps, we first propose a novel algorithm to estimate 3D motion vectors (3DMVs) for arbitrarily shaped sub-blocks in texture-plus-depth videos. We then derive two feature channels from the 3DMVs and incorporate them into a widely accepted bottom-up saliency model. Experiments on the subjective quality of Region-of-Interest (ROI)-based video coding show that our enriched saliency model with 3DMV channels estimates human visual attention more accurately.
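
To make the notion of a 3D motion vector concrete, the following is a minimal sketch, not the estimation algorithm proposed in the paper. It assumes a pinhole camera with hypothetical intrinsics (`focal`, `cx`, `cy`), a pixel correspondence between two frames that is already known, and metric depth values read from the per-pixel depth maps. The two quantities returned by `illustrative_channels` are illustrative choices only; the abstract does not specify which two feature channels the authors derive.

```python
import math


def back_project(u, v, depth, focal, cx, cy):
    """Back-project pixel (u, v) with depth Z into camera coordinates (pinhole model)."""
    x = (u - cx) * depth / focal
    y = (v - cy) * depth / focal
    return (x, y, depth)


def motion_3d(pixel_prev, depth_prev, pixel_curr, depth_curr,
              focal=500.0, cx=320.0, cy=240.0):
    """Return the 3D displacement (dx, dy, dz) of a point between two frames.

    The intrinsics are assumed values for illustration; the pixel
    correspondence is assumed to come from a separate 2D motion search.
    """
    p0 = back_project(*pixel_prev, depth_prev, focal, cx, cy)
    p1 = back_project(*pixel_curr, depth_curr, focal, cx, cy)
    return tuple(b - a for a, b in zip(p0, p1))


def illustrative_channels(mv):
    """Two hypothetical features from a 3D motion vector (not the paper's channels).

    approach: positive when the point moves towards the camera (depth decreases).
    speed:    Euclidean length of the 3D displacement.
    """
    dx, dy, dz = mv
    approach = -dz
    speed = math.sqrt(dx * dx + dy * dy + dz * dz)
    return approach, speed


# A point on the optical axis that moves from 2.0 to 1.5 depth units.
mv = motion_3d((320, 240), 2.0, (320, 240), 1.5)
channels = illustrative_channels(mv)
```

In this example the point has no lateral displacement, so a purely 2D motion channel would register nothing, while the depth change still yields a nonzero 3D motion vector.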
