Evaluation of deep pose detectors for automatic analysis of film style.

07 Apr 2022 | OpenReview Archive Direct Upload | Readers: Everyone
Abstract: Identifying human characters (how they are portrayed, the actions they carry out, and their interactions in the scene) is central to understanding plot and assessing artistic value in films, and is inherently linked to how we perceive and interpret visual media. Building computational models that are sensitive to story will thus require a formal representation of the character. Yet this kind of data is complex and tedious to annotate at the frame level and at large scale. Human pose estimation (HPE) may be the key to facilitating this task: it can identify features such as position, size, and movement that can be transformed into input for machine learning models and support higher-level artistic and storytelling interpretation. However, current HPE methods operate mainly on non-professional image content, and there has been no comprehensive evaluation of their performance on artistic film. In this work, we first propose a formal representation of the character based on cinematography theory. We design tools to annotate a sampled subset of 2500 images from three datasets with this representation, one of which we introduce to the community. We then conduct an in-depth analysis to measure the general performance of two recent HPE methods in terms of precision and recall, and to examine the impact of cinematographic style. From these findings, we highlight the advantages of HPE for automated film analysis and propose future directions to improve the performance of HPE methods on artistic film content.