From On-Field Actions to Internal States: A Latent Variable Framework for Analyzing Athlete Performance

03 Sept 2025 (modified: 08 Oct 2025). Submitted to Agents4Science. License: CC BY 4.0
Keywords: Scoring Dynamics, Hidden Markov Model, HMM-GLM, Sports
Abstract: Traditional sports analytics relies on independence assumptions that fail to capture temporal dependencies and streak phenomena in athletic performance. We propose a Hidden Markov Model-Generalized Linear Model (HMM-GLM) framework for modeling latent performance states, positing that observable fluctuations emerge from underlying persistent states rather than from direct causation between events. We systematically evaluate the framework on play-by-play data from three professional sports leagues: MLB, NBA, and NHL. The HMM models unobservable state transitions, while the GLM uses the inferred states to predict outcomes, with sport-specific adaptations for context-aware transitions and class-imbalance handling. Results show substantial improvements over baseline models in baseball and basketball, with significant AUC gains and a positive delta log-likelihood, indicating that the model captures temporal dependencies. The learned states exhibit meaningful performance differentiation and moderate persistence, providing statistical support for the "hot hand" phenomenon. In hockey, however, the framework showed limited effectiveness, revealing critical boundary conditions. Our analysis identifies class balance and event structure as fundamental determinants of success: sports with moderate outcome rates facilitate effective state learning, while extreme imbalance impedes identification of latent structure. Cross-domain analysis reveals sport-specific dynamics with limited generalization across leagues. These findings provide the first systematic validation of latent performance states in professional sports and establish guidelines for sequential modeling in athletic contexts. The framework challenges traditional independence assumptions and offers practical tools for performance evaluation and strategic decision-making, with implications for broader sequential modeling applications.
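To make the two quantities named in the abstract concrete (inferred latent states and delta log-likelihood against an independence baseline), the following is a minimal illustrative sketch, not the authors' implementation. It assumes a two-state HMM with Bernoulli emissions, which corresponds to a logistic GLM with only a state-specific intercept, and it uses hand-picked rather than fitted parameters. The function names, the toy outcome sequence, and all parameter values are hypothetical; the context-aware transitions and class-imbalance handling described in the abstract are omitted.

```python
import math


def forward_filter(obs, trans, emit, init):
    """Scaled forward algorithm for an HMM with Bernoulli emissions.

    obs   : list of 0/1 outcomes
    trans : trans[i][j] = P(state j at t+1 | state i at t)
    emit  : emit[k] = P(outcome = 1 | state k)
    init  : init[k] = P(state k before the first outcome)
    Returns (log_likelihood, filtered), where
    filtered[t][k] = P(state k at t | obs[0..t]).
    """
    n_states = len(init)
    log_lik = 0.0
    filtered = []
    prior = list(init)
    for y in obs:
        # Weight each state's prior by the likelihood of the observed outcome.
        joint = [
            prior[k] * (emit[k] if y == 1 else 1.0 - emit[k])
            for k in range(n_states)
        ]
        norm = sum(joint)  # one-step-ahead predictive probability of y
        log_lik += math.log(norm)
        post = [j / norm for j in joint]
        filtered.append(post)
        # Propagate the filtered belief through the transition matrix.
        prior = [
            sum(post[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return log_lik, filtered


def baseline_log_lik(obs):
    """Log-likelihood of an i.i.d. Bernoulli model at the empirical rate."""
    p = sum(obs) / len(obs)
    return sum(math.log(p if y == 1 else 1.0 - p) for y in obs)


# Hypothetical streaky outcome sequence (1 = success, 0 = failure).
obs = ([1] * 6 + [0] * 6) * 2

# Assumed, hand-picked parameters: state 0 = "cold", state 1 = "hot".
trans = [[0.9, 0.1], [0.1, 0.9]]  # persistent states
emit = [0.2, 0.8]                 # success probability per state
init = [0.5, 0.5]

hmm_ll, filtered = forward_filter(obs, trans, emit, init)
delta = hmm_ll - baseline_log_lik(obs)  # positive favors the latent-state model
print(f"delta log-likelihood: {delta:.3f}")
```

On a streaky sequence like this one, the latent-state model assigns higher likelihood than the independence baseline, so `delta` is positive. On a sequence with a very low success rate and no streak structure, the gap would be expected to shrink, which is consistent with the boundary conditions the abstract reports for hockey.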
Supplementary Material: zip
Submission Number: 75