Domain adaptation with optimized feature distribution for streamer action recognition in live video

Chen He, Jing Zhang, Lin Chen, Hui Zhang, Li Zhuo

Published: 01 Jan 2025, Last Modified: 12 Apr 2025, Int. J. Mach. Learn. Cybern. 2025, CC BY-SA 4.0

Abstract: Because large-scale annotation of streamer actions is expensive, training on generic action data is a practical alternative. However, the spatiotemporal differences between generic actions and streamer actions reduce recognition accuracy. Domain adaptation uses labeled data from both the source and target domains to mitigate the performance degradation on target-domain data, but it relies on two assumptions: (1) the feature distribution of each category satisfies the clustering assumption, and (2) features of the same category have minimal distribution discrepancy across domains. Since streamer action recognition in live video does not meet these assumptions, we propose a domain adaptation method with optimized feature distribution for this task. The method generates diverse features for each sample through a style transfer module and then uses the proposed metric learning loss to constrain the features to a similar feature space so that both assumptions are satisfied. Experimental results show that our method achieves an accuracy of 86.35%, exceeding the SOTA by 4.71%, and an inference speed of 1500 FPS, which makes it capable of streamer action recognition in live video.
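
The two steps named in the abstract (generating style-varied features, then constraining them with a metric loss) can be illustrated with a minimal sketch. The abstract does not specify either component, so both are stand-ins chosen for illustration: an AdaIN-style swap of feature statistics in place of the style transfer module, and a center-based pull/push loss in place of the proposed metric learning loss. All function names, the `margin` parameter, and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math
from statistics import fmean, pstdev


def restyle(feature, style_feature, eps=1e-6):
    """AdaIN-style stand-in for a style transfer module (an assumption).

    Keeps the normalized content of `feature` but adopts the mean and
    standard deviation of `style_feature`.
    """
    mu_c, sd_c = fmean(feature), pstdev(feature)
    mu_s, sd_s = fmean(style_feature), pstdev(style_feature)
    return [(x - mu_c) / (sd_c + eps) * sd_s + mu_s for x in feature]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_centers(features, labels):
    """Per-class mean feature vector."""
    grouped = {}
    for f, y in zip(features, labels):
        grouped.setdefault(y, []).append(f)
    return {y: [fmean(col) for col in zip(*fs)] for y, fs in grouped.items()}


def center_metric_loss(features, labels, margin=1.0):
    """Center-based stand-in for a metric learning loss (an assumption).

    Pull term: mean squared distance of each feature to its class center,
    which encourages compact clusters.
    Push term: hinge penalty when two class centers are closer than `margin`,
    which encourages separation between categories.
    """
    centers = class_centers(features, labels)
    pull = fmean([sq_dist(f, centers[y]) for f, y in zip(features, labels)])
    keys = sorted(centers)
    pairs = [(a, b) for i, a in enumerate(keys) for b in keys[i + 1:]]
    if not pairs:
        return pull
    push = fmean(
        [max(0.0, margin - math.sqrt(sq_dist(centers[a], centers[b])))
         for a, b in pairs]
    )
    return pull + push
```

In this sketch, pooling source features, target features, and their restyled variants into one batch before calling `center_metric_loss` would push same-category features from both domains toward a shared center, which is one plausible way to approach the two assumptions listed in the abstract.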