Video Scene Interpretation Using Perceptual Prominence and Mise-en-scène Features

Gaurav Harit, Santanu Chaudhury

Published: 2006, Last Modified: 20 May 2025; ACCV (2) 2006; CC BY-SA 4.0

Abstract: We propose an empirical computational model that generates an interpretation of a video shot based on our proposed principle of perceptual prominence. This principle captures the key aspects of mise-en-scène required for interpreting a video scene. We present a novel approach for applying perceptual grouping principles to the spatio-temporal domain of video. Our spatio-temporal perceptual grouping scheme, applied to blob tracks, makes use of a specified spatio-temporal coherence model. A high-level semantic interpretation of scenes is obtained using the mise-en-scène features and the perceptual prominence computed for the perceptual clusters.