- Paper length: 4 pages
- Abstract: In this paper, we present a simple aggregation of frame-level CNN features in a face track to produce a track-level feature representation for face clustering in movies and videos. The approach is invariant to the order of the frames in the track and to the number of frames it contains. We demonstrate the effectiveness of this strategy on three challenging benchmark video face clustering datasets: Big Bang Theory, Buffy the Vampire Slayer, and Notting Hill. Experiments with this straightforward strategy show promising results on all three datasets. In addition, the strategy improves the baseline performance of generic face clustering methods without using any additional external constraints.
- TL;DR: A Simple and Effective Technique for Face Clustering in TV Series
- Keywords: Track description, face clustering
- Conflicts: kit.edu, kuleuven.be
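
The abstract does not state which aggregation operator is used, so the following is only a minimal sketch of one way a track-level descriptor with the stated invariances could be built. It assumes average pooling followed by L2 normalization; the function name `aggregate_track` and the array layout `(n_frames, dim)` are hypothetical and not taken from the paper.

```python
import numpy as np


def aggregate_track(frame_features: np.ndarray) -> np.ndarray:
    """Pool per-frame descriptors of shape (n_frames, dim) into one (dim,) vector.

    Assumption: mean pooling plus L2 normalization. The paper's actual
    aggregation may differ.
    """
    feats = np.asarray(frame_features, dtype=np.float64)
    if feats.ndim != 2 or feats.shape[0] == 0:
        raise ValueError("expected a non-empty (n_frames, dim) array")

    # Averaging over the frame axis ignores frame order and yields a
    # fixed-size output regardless of how many frames the track has.
    pooled = feats.mean(axis=0)

    # Normalize so that tracks can be compared by cosine or Euclidean distance.
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    track = rng.normal(size=(7, 16))      # 7 frames, 16-dim features (toy sizes)
    descriptor = aggregate_track(track)
    reversed_descriptor = aggregate_track(track[::-1])
    print(descriptor.shape, np.allclose(descriptor, reversed_descriptor))
```

Under these assumptions, the resulting track descriptors all have the same dimensionality and could be passed to any generic clustering method in place of individual frame features.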