Adaptively forget with crossmodal and textual distillation for class-incremental video captioning

Published: 01 Jan 2025, Last Modified: 05 Mar 2025. Venue: Neurocomputing 2025. License: CC BY-SA 4.0
Abstract: Highlights
• We are the first to address catastrophic forgetting in video captioning without playback.
• Modules are designed for visual encoding, textual decoding, and network supervision.
• Experiments on MSR-VTT, MSVD, and VATEX show that the proposed method is effective.