Abstract: Incremental Object Detection (IOD) simulates the dynamic data flow of real-world applications, which requires detectors to learn new classes or adapt to domain shifts while retaining knowledge from previous tasks. Most existing IOD methods focus only on class-incremental learning, assuming all data come from the same domain. However, this assumption rarely holds in practice, as images collected under different conditions often exhibit markedly different characteristics, such as lighting, weather, and style.
Class-incremental IOD methods suffer severe performance degradation in such scenarios with domain shifts. To bridge domain shifts and category gaps in IOD, we propose Purified Distillation (PD), which uses a set of trainable queries to transfer the teacher's attention on old tasks to the student and adopts a gradient reversal layer to guide the student to learn the structure of the teacher's feature space at a fine-grained level. This strategy further exploits the features extracted by the teacher during incremental learning, which has not been extensively studied in previous work. Meanwhile, PD combines classification confidence with localization confidence to purify the most meaningful output nodes, so that the student model inherits more comprehensive teacher knowledge.
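To make these two mechanisms concrete, the following is a minimal PyTorch-style sketch, not the paper's actual implementation: a gradient reversal layer that passes features through unchanged in the forward pass but negates gradients in the backward pass (the standard construction for adversarial feature alignment), and a purification step that ranks teacher outputs by the product of classification confidence and a localization-confidence score. All names here (GradReverse, purified_distillation_targets, box_quality, top_k) are illustrative assumptions, not identifiers from the paper.

```python
import torch


class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity in the forward pass,
    gradient scaled by -lambd in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient; no gradient for lambd.
        return -ctx.lambd * grad_output, None


def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)


def purified_distillation_targets(cls_logits, box_quality, top_k=100):
    """Select the most reliable teacher outputs for distillation by
    combining classification confidence with localization confidence.

    cls_logits:  (N, num_classes) raw teacher class logits.
    box_quality: (N,) hypothetical per-prediction localization
                 confidence (e.g. a predicted IoU score).
    Returns indices of the top_k predictions by joint confidence.
    """
    cls_conf = cls_logits.sigmoid().max(dim=-1).values  # (N,)
    joint_score = cls_conf * box_quality                # purified score
    keep = joint_score.topk(min(top_k, joint_score.numel())).indices
    return keep
```

In this sketch, distilling only on the indices returned by purified_distillation_targets restricts the knowledge-transfer loss to teacher outputs that are confident in both what the object is and where it is, which is one plausible reading of the purification described above.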
Extensive experiments across various IOD settings on six widely used datasets show that PD significantly outperforms state-of-the-art methods. Even after five steps of incremental learning, our method preserves 60.6\% mAP on the first task, while competing methods maintain at most 55.9\%.
Primary Subject Area: [Content] Vision and Language
Secondary Subject Area: [Content] Vision and Language
Relevance To Conference: Object detection plays a crucial role in the multimedia domain, as it identifies and localizes specific objects in images or videos, thereby providing fundamental support for applications such as video surveillance, intelligent transportation, and virtual reality. However, as environments change, multimedia data is continuously updated, which necessitates incremental learning techniques to keep models flexible and adaptive. This work therefore focuses on incremental object detection under both category gaps and domain shifts, and proposes Purified Distillation (PD) to mitigate catastrophic forgetting in such scenarios, contributing an effective method for preserving previous knowledge in the dynamic data flows of multimedia processing.
Supplementary Material: zip
Submission Number: 2279