Dynamic Switching Teacher: How to Generalize Temporal Action Detection Models

Fangming Feng; Sihang Cai; Zirun Guo; Weicai Yan; Yangyang Wu; Zehan Wang; Xize Cheng; Tao Jin

25 Sept 2024 (modified: 13 Nov 2024) | ICLR 2025 Conference Withdrawn Submission | CC BY 4.0

Keywords: Video Understanding, Temporal Action Detection, Domain Adaptation

TL;DR: This study proposes a source-free domain adaptation method for temporal action detection in videos using a dynamic switching teacher.

Abstract: Temporal Action Detection (TAD) is a crucial task in video understanding that focuses on precisely identifying the onset and termination of specific actions within video sequences. Despite advances on certain datasets, existing methods often struggle to maintain their efficacy when applied to datasets from disparate domains. In this study, we introduce, for the first time, source-free domain adaptation (SFDA) techniques to TAD, aiming to enhance the generalization capability of TAD models on unlabeled target datasets without access to source data. Most popular SFDA methods follow the Mean-Teacher (MT) framework and often falter under significant domain shift. Pseudo labels generated by a teacher model pre-trained on the source domain can trigger a cascade of errors when they guide the training of a student model, potentially causing a harmful feedback loop in TAD. To address this issue, we propose a novel dynamic switching teacher strategy that integrates both dynamic and static teacher models. The dynamic teacher model updates its parameters by learning from the student model. Concurrently, the static teacher model periodically exchanges weights with the student model, ensuring baseline performance and maintaining the quality of pseudo labels. This approach significantly mitigates label noise. We establish the first benchmark for SFDA in TAD tasks and conduct extensive experiments across various datasets. Our method demonstrates state-of-the-art performance, substantiating its suitability for TAD.
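The two-teacher schedule described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. The abstract does not state the update rule for the dynamic teacher, so the exponential moving average (EMA) used here is an assumption borrowed from the standard Mean-Teacher setup. The names `ema_update`, `exchange_weights`, `adapt`, `momentum`, `swap_every`, and `student_step` are all hypothetical, and parameters are modelled as plain dictionaries of float lists rather than real network weights.

```python
from typing import Callable, Dict, List

# Toy stand-in for model weights: parameter name -> list of floats.
Params = Dict[str, List[float]]


def ema_update(teacher: Params, student: Params, momentum: float = 0.99) -> None:
    """Assumed rule: teacher <- momentum * teacher + (1 - momentum) * student."""
    for name, t_vals in teacher.items():
        s_vals = student[name]
        teacher[name] = [
            momentum * t + (1.0 - momentum) * s for t, s in zip(t_vals, s_vals)
        ]


def exchange_weights(static_teacher: Params, student: Params) -> None:
    """Swap the weights of the static teacher and the student in place."""
    for name in static_teacher:
        static_teacher[name], student[name] = student[name], static_teacher[name]


def adapt(
    student: Params,
    dynamic_teacher: Params,
    static_teacher: Params,
    num_steps: int,
    swap_every: int,
    student_step: Callable[[Params, Params, Params], None],
    momentum: float = 0.99,
) -> None:
    """Run a hypothetical adaptation loop on unlabeled target data.

    `student_step` is a placeholder for one training step in which the
    student learns from pseudo labels produced by the two teachers.
    """
    for step in range(1, num_steps + 1):
        student_step(student, dynamic_teacher, static_teacher)
        # The dynamic teacher follows the student at every step.
        ema_update(dynamic_teacher, student, momentum)
        # The static teacher only changes at the periodic exchange.
        if step % swap_every == 0:
            exchange_weights(static_teacher, student)
```

In this sketch the static teacher stays frozen between exchanges, which is one way to read "ensuring baseline performance"; the actual exchange period and update rule would have to come from the paper itself.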

Supplementary Material: zip

Primary Area: transfer learning, meta learning, and lifelong learning

Submission Number: 4063
