Abstract: A long-standing objective in human-AI interaction is to build personalized AI coaching systems that improve human skill without distorting a player's measurable behavioral patterns. We hypothesize that the common problem of style drift in AI coaching stems from a failure to recognize the underlying causal structure, namely a collider linking skill and behavioral patterns. We propose a methodological testbed for formalizing, quantifying, and addressing skill-behavior disentanglement under this causal structure. Rather than modeling holistic chess style, we target a tractable proxy problem: decoupling skill from six interpretable aggregate play statistics. Because this simplified feature space allows controlled testing of the collider hypothesis with known ground truth, we position our contribution as methodological rather than as a comprehensive solution to chess coaching. On 30,000 real-world chess games, unsupervised disentanglement models ($\beta$-VAE, InfoGAN) fail on our testbed (MIG $\approx$ 0), while our causally informed architecture achieves strong disentanglement (MIG = 0.89, HSIC $\approx$ 0.00016). Our model produces statistically independent latent representations while maintaining excellent predictive accuracy. Although we achieve statistical disentanglement on the defined features, validating whether the learned representations capture meaningful strategic concepts or enable effective coaching would require evaluation by chess domain experts. Our contribution demonstrates the statistical mechanism by which collider bias prevents disentanglement and shows how HSIC regularization addresses it.
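The abstract reports an HSIC value as evidence of independence between latent representations. As a minimal sketch of how such a quantity is typically computed (the paper's actual kernels, bandwidths, and estimator variant are not specified here; RBF kernels and the standard biased empirical estimator are assumptions):

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    # RBF kernel Gram matrix from pairwise squared distances.
    sq = np.sum(x ** 2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2.0 * x @ x.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, z, sigma=1.0):
    # Biased empirical HSIC estimate between two batches of samples.
    # A value near zero indicates (approximate) statistical independence,
    # which is the sense in which the abstract reports HSIC ~ 0.00016.
    n = x.shape[0]
    K, L = rbf_gram(x, sigma), rbf_gram(z, sigma)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kc, Lc = H @ K @ H, H @ L @ H
    return np.trace(Kc @ Lc) / (n - 1) ** 2
```

Used as a regularizer, this estimate would be added to the training loss between the skill latent and the behavioral latents, penalizing any nonlinear dependence between them; the specific weighting is a detail of the paper's architecture not given in the abstract.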
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Yutian_Chen1
Submission Number: 7231