Positive and Negative Set Designs in Contrastive Feature Learning for Temporal Action Segmentation

Yi-Chen Chen, Wei-Ta Chu

Published: 01 Jan 2024, Last Modified: 04 Nov 2025. Venue: IEEE Transactions on Circuits and Systems for Video Technology. License: CC BY-SA 4.0

Abstract: When data labels are scarce, contrastive learning is often used to learn representations in a weakly supervised or unsupervised way. In contrastive learning, not only the learning mechanism but also the design of the positive and negative sets is critical. While most previous work on Temporal Action Segmentation (TAS) focuses on designing new segmentation methods, we investigate the importance of positive and negative set designs in contrastive learning and verify that better representations can be learned to enhance the performance of existing TAS methods. For timestamp-supervised TAS and unsupervised TAS, respectively, we propose positive/negative set designs built on the ideas of ambiguous frames and a set expansion process to make the learned representations more effective. In the evaluation, we demonstrate that the performance of timestamp-supervised TAS can be boosted by 8% to 15% in terms of F1@10 across three different datasets, and that the performance of unsupervised TAS can be boosted by 3% to 5% in terms of F1 scores, achieving new state-of-the-art TAS results.
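
To make the role of the positive and negative sets concrete, here is a minimal sketch of a generic set-based contrastive loss over frame features, written with NumPy. It is not the authors' formulation: the abstract does not specify how ambiguous frames are identified or how sets are expanded, so this sketch simply assumes that ambiguous frames carry a placeholder label of -1 and are left out of every set. The names `masks_from_labels` and `set_contrastive_loss`, the temperature value, and the toy features are all hypothetical.

```python
import numpy as np

def masks_from_labels(labels, ambiguous=-1):
    """Build boolean positive/negative masks from per-frame labels.

    Assumption (not from the paper): frames labelled `ambiguous`
    belong to no positive or negative set at all.
    """
    labels = np.asarray(labels)
    known = labels != ambiguous
    both_known = known[:, None] & known[None, :]
    same = labels[:, None] == labels[None, :]
    return both_known & same, both_known & ~same

def set_contrastive_loss(features, pos_mask, neg_mask, temperature=0.1):
    """Generic contrastive loss with explicit positive/negative sets.

    For each anchor i with a non-empty positive set P(i), average
    -log( exp(s_ip) / sum_{a in P(i) or N(i)} exp(s_ia) ) over p in P(i),
    where s is temperature-scaled cosine similarity.
    """
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature
    eye = np.eye(sim.shape[0], dtype=bool)
    pos = pos_mask & ~eye                 # an anchor is never its own positive
    neg = neg_mask & ~eye & ~pos
    candidates = pos | neg                # frames in neither set are ignored

    # Numerically stable log-sum-exp restricted to the candidate set.
    masked = np.where(candidates, sim, -np.inf)
    row_max = masked.max(axis=1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    log_denom = row_max[:, 0] + np.log(np.exp(masked - row_max).sum(axis=1) + 1e-12)

    log_prob = sim - log_denom[:, None]
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0                     # skip anchors without positives
    if not valid.any():
        return 0.0
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1)[valid] / n_pos[valid]
    return float(per_anchor.mean())

# Toy example: four frames, the last one treated as ambiguous.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]])
pos_mask, neg_mask = masks_from_labels([0, 0, 1, -1])
loss = set_contrastive_loss(features, pos_mask, neg_mask)
```

In this toy setup, the loss is small when frames that share a label are already close in feature space, and it grows when the positive set pairs dissimilar frames. Changing the feature of the frame labelled -1 has no effect on the loss, which is the behaviour one would expect if uncertain frames are kept out of both sets.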

External IDs: doi:10.1109/tcsvt.2024.3417392