EMG-JEPA: Towards Scalable and Generalizable sEMG-Based Hand Pose Estimation via Self-Supervised Learning
Abstract: This work introduces EMG-JEPA, a Joint Embedding Predictive Architecture (JEPA) designed to improve generalization for hand pose estimation from surface electromyography (sEMG) signals. Collecting labeled sEMG data for hand pose estimation is costly, as it requires synchronizing the sEMG recordings with motion capture systems to obtain precise joint-angle annotations. To mitigate the dependency on such expensive labels, EMG-JEPA uses self-supervised learning to derive transferable representations from unlabeled sEMG signals, which can then be fine-tuned for downstream hand pose estimation. We analyze the effectiveness of EMG-JEPA on data collected from three wrist-worn devices, providing signals with 8, 16, and 110 channels. Our results show that EMG-JEPA can improve cross-user hand pose estimation, particularly in high-channel-density settings, reducing joint-angle error by up to 3.55% and 5.13% for the 16- and 110-channel setups, respectively. Further, results from the 8-channel setup suggest a channel-density threshold (≈16 channels), below which JEPA-based pretraining offers limited gains. Overall, our study identifies key design choices for developing a JEPA for sEMG, offering a scalable approach to reduce labeled data requirements.
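The abstract's core mechanism, predicting the latent representation of a masked portion of the sEMG signal from its context, can be illustrated with a minimal sketch. All names, shapes, and the linear encoders below are illustrative assumptions (the paper's actual architecture is not specified here); the sketch only shows the JEPA pattern of a context encoder, a predictor, a latent-space loss, and an EMA-updated target encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 16 sEMG channels, a 50-sample window,
# and a 32-dim latent space. All values are illustrative, not the paper's.
C, T, D = 16, 50, 32

def encode(x, W):
    """Toy linear encoder: flatten a (C, T//2) sEMG patch into a D-dim latent."""
    return np.tanh(W @ x.reshape(-1))

# Context encoder, predictor, and EMA target encoder weights.
W_ctx = rng.normal(0, 0.02, (D, C * T // 2))
W_prd = rng.normal(0, 0.02, (D, D))
W_tgt = W_ctx.copy()  # target encoder starts as a copy, then tracks via EMA

def jepa_step(window, momentum=0.99, lr=1e-2):
    """One self-supervised step: predict the masked half's latent from the context half."""
    global W_prd, W_tgt
    ctx, tgt = window[:, : T // 2], window[:, T // 2:]  # temporal split as the "mask"
    z_ctx = encode(ctx, W_ctx)          # context latent
    z_pred = W_prd @ z_ctx              # predicted target latent
    z_tgt = encode(tgt, W_tgt)          # target latent (no gradient flows here)
    err = z_pred - z_tgt
    loss = float(np.mean(err ** 2))     # loss lives in latent space, not signal space
    # Gradient step on the predictor only, kept minimal for clarity.
    W_prd -= lr * np.outer(2 * err / D, z_ctx)
    # EMA update of the target encoder from the context encoder.
    W_tgt = momentum * W_tgt + (1 - momentum) * W_ctx
    return loss

losses = [jepa_step(rng.normal(size=(C, T))) for _ in range(5)]
```

After pretraining on unlabeled windows like these, the context encoder's representations would be fine-tuned with a small labeled set for joint-angle regression, which is the labeled-data reduction the abstract describes.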
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Xiaofeng_Cao1
Submission Number: 7101