Learning Golf Swing Key Events from Gaussian Soft Labels Using Multi-Scale Temporal MLPFormer

Published: 01 Jan 2023, Last Modified: 24 May 2025, Venue: IJCNN 2023, License: CC BY-SA 4.0
Abstract: A complete golf swing comprises several key events. How closely a player's pose at each key event matches the standard form directly affects the quality of the shot. It is therefore useful for players to analyze their poses, especially at key frames, in order to improve their swing performance. With the rapid development of deep learning techniques in computer vision, it has become possible to detect key frames during a golf swing. In this paper, we propose a framework that recognizes key events in a golf swing from monocular video data alone. To achieve this, we incorporate an attention mechanism into the backbone network to extract concise features and use a transformer structure to fuse multi-scale temporal information, which enhances the feature representation. In addition, we introduce Gaussian kernels into the label generation process, which effectively addresses the ambiguity between a key event and its visually similar neighbouring frames. Notably, our method achieves an average recognition accuracy of 83.4% (+7.3% compared with SwingNet) for eight golf swing events on the GolfDB dataset.
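The abstract does not give the exact label formulation, so the following is only a minimal sketch of how Gaussian soft labels for key events might be generated. It assumes one annotated frame index per event, a hypothetical kernel width `sigma`, and an extra "no event" class that absorbs the remaining probability mass. The function name `gaussian_soft_labels` and all parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def gaussian_soft_labels(num_frames, event_frames, sigma=1.5):
    """Build per-frame soft labels of shape (num_frames, num_events + 1).

    Column k holds a Gaussian bump centred on the annotated frame of event k;
    the last column is an assumed "no event" class. `sigma` is a hypothetical
    kernel width in frames, not a value reported by the authors.
    """
    t = np.arange(num_frames, dtype=np.float64)[:, None]            # (T, 1)
    centers = np.asarray(event_frames, dtype=np.float64)[None, :]   # (1, K)

    # Unnormalised Gaussian kernel: 1.0 at the annotated frame, decaying nearby.
    event_scores = np.exp(-0.5 * ((t - centers) / sigma) ** 2)       # (T, K)

    # Whatever mass the events do not claim goes to the background class.
    background = np.clip(1.0 - event_scores.sum(axis=1, keepdims=True), 0.0, None)

    labels = np.concatenate([event_scores, background], axis=1)     # (T, K + 1)
    # Renormalise so every row is a valid probability distribution.
    return labels / labels.sum(axis=1, keepdims=True)

# Illustrative annotation: eight events in a 64-frame clip (made-up indices).
labels = gaussian_soft_labels(64, [5, 12, 20, 28, 35, 42, 50, 58], sigma=1.5)
```

With labels of this kind, frames adjacent to an annotated key frame receive partial credit for the event instead of being treated as pure negatives, which is one plausible way to soften the ambiguity the abstract describes. Such targets are typically trained against with a soft cross-entropy or KL-divergence loss.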