Keywords: skilled human activity understanding, weakly supervised learning, action quality assessment, long-form video understanding
TL;DR: We present an approach to understanding skilled human activity from binary proficiency labels alone and introduce Sparse Skill Extractor, which learns interpretable representations predictive of skill proficiency.
Abstract: Understanding skilled human activity is crucial in fields such as sports analytics, medical training, and professional development, where assessing proficiency can directly influence performance and outcomes. However, many existing approaches rely on human-annotated numerical scores or rankings, which are time-consuming to collect and inherently subjective. In contrast, binary labels that categorize proficiency as either high or low carry less detailed information but are easier to collect, and can often be derived from group characteristics, such as the distinction between novices and experts in surgical training. This weakly supervised setting challenges models to uncover intrinsic patterns that reflect proficiency based solely on these weak labels. To this end, we introduce Sparse Skill Extractor, a multi-scale contrastive learning framework. It enforces both local and global feature comparisons between the two proficiency groups while pruning irrelevant video segments to highlight key moments of skilled or unskilled performance. Our results demonstrate that Sparse Skill Extractor not only delivers strong performance in predicting demonstrator proficiency but also enhances interpretability by facilitating the detection of non-proficient timestamps in low-proficiency demonstrations.
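The abstract describes the framework only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes precomputed per-segment features and per-segment relevance scores, uses a standard supervised contrastive loss as a stand-in for the group comparison, and treats the pruned segment features as the local scale and their mean as the global scale. The function names (`prune_segments`, `supervised_contrastive_loss`, `multi_scale_loss`), the top-k pruning rule, and the temperature are all hypothetical choices made for illustration.

```python
import numpy as np

def prune_segments(features, scores, keep):
    """Keep the `keep` highest-scoring segments, preserving temporal order.
    Top-k selection is an assumed stand-in for the paper's pruning step."""
    idx = np.sort(np.argsort(scores)[-keep:])
    return features[idx], idx

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss: samples sharing a label are positives."""
    z = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-8)
    sim = z @ z.T / temperature
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    not_self = ~np.eye(len(labels), dtype=bool)
    # Log-softmax of each anchor's similarities over all other samples.
    exp = np.exp(sim) * not_self
    log_prob = sim - np.log(exp.sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0  # skip anchors with no same-label partner
    per_anchor = -(log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return per_anchor.mean()

def multi_scale_loss(videos, scores, labels, keep=4):
    """Sum of a local (segment-level) and a global (video-level) contrastive term.
    `labels` holds one binary proficiency label per video (1 = high, 0 = low)."""
    local_feats, local_labels, global_feats = [], [], []
    for feats, s, y in zip(videos, scores, labels):
        kept, _ = prune_segments(feats, s, keep)
        local_feats.append(kept)
        local_labels.extend([y] * len(kept))   # segments inherit the video label
        global_feats.append(kept.mean(axis=0))  # pooled video-level feature
    local = supervised_contrastive_loss(np.concatenate(local_feats), np.array(local_labels))
    glob = supervised_contrastive_loss(np.stack(global_feats), np.array(labels))
    return local + glob
```

In this sketch the indices returned by `prune_segments` double as a rough interpretability signal: for a low-proficiency video, the retained segments are the ones the relevance scores mark as most informative. How the paper actually scores and localizes non-proficient timestamps is not stated in the abstract.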
Supplementary Material: pdf
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Submission Number: 2252