Abstract: Video identification is an important task in practical and industrial applications. The ACM International Conference on Multimedia and iQIYI co-hosted a celebrity video identification challenge based on the iQIYI-VID-2019 dataset. For this competition, we propose a new feature fusion method and design a residual dense network that improves video identification performance in complex scenes. Using face features only, we achieve a mean Average Precision (mAP) of 0.9035, which ranks second on the leaderboard. It is also the best score obtained using only the official features. Notably, our model requires only 0.5G FLOPs, and predicting the entire test dataset takes only 2 to 5 minutes. Our method balances accuracy and speed, which gives it strong practical value.
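For readers unfamiliar with the metric, the following is a minimal sketch of how mean Average Precision is commonly computed for a retrieval task of this kind. It is not the official challenge scorer, which may differ (for example, by truncating each ranked list). The function names, the per-identity ranking format, and the toy video IDs are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranked_ids: List[str], relevant: Set[str]) -> float:
    """AP for one identity: mean of precision@k over the ranks k that hit a relevant video."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, video_id in enumerate(ranked_ids, start=1):
        if video_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Relevant videos that never appear in the ranking contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], ground_truth: Dict[str, Set[str]]
) -> float:
    """mAP: average of the per-identity AP over all identities in the ground truth."""
    if not ground_truth:
        return 0.0
    aps = [
        average_precision(rankings.get(identity, []), relevant)
        for identity, relevant in ground_truth.items()
    ]
    return sum(aps) / len(aps)


# Hypothetical toy data: two identities, each with a ranked list of video IDs.
toy_rankings = {"id_a": ["v1", "v2", "v3"], "id_b": ["v4", "v5"]}
toy_truth = {"id_a": {"v1", "v3"}, "id_b": {"v5"}}
toy_map = mean_average_precision(toy_rankings, toy_truth)
```

In the toy data, the first identity has relevant videos at ranks 1 and 3 and the second at rank 2, so the per-identity AP values are 5/6 and 1/2.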