A Multimodality Framework for Creating Speaker/Non-Speaker Profile Databases for Real-World Video

Jehanzeb Abbas, Charlie K. Dagli, Thomas S. Huang

2007 (modified: 10 Nov 2022), CVPR 2007

Abstract: We propose a complete solution to full-modality person profiling for speakers and sub-modality person profiling for non-speakers in real-world videos. This is a step towards building an elaborate database of face, name, and voice correspondences for speakers appearing in news videos. In addition, we are interested in a database of name and face correspondences only for non-speakers who appear during voice-overs. We use an unsupervised technique for creating the speaker identification database, and a unique primary feature matching and parallel line matching algorithm for creating the non-speaker identification database. We tested our approach on real-world data, and the results show good performance for news videos. The approach can be incorporated into a larger multimedia news video analysis system or a multimedia search system for efficient news video retrieval and browsing.
