Label-Efficient Online Continual Object Detection in Streaming Video

22 Sept 2022 (modified: 12 Mar 2024), ICLR 2023 Conference Withdrawn Submission, Readers: Everyone
Keywords: Online Continual Learning, Object Detection, Complementary Learning Systems, Streaming Video
Abstract: To thrive in evolving environments, humans continually acquire and transfer new knowledge from a continuous video stream with minimal supervision, while retaining previously learnt experiences. In contrast, most standard continual learning (CL) benchmarks focus on learning from static i.i.d. images that are all labeled for training. Here, we examine a more realistic and challenging problem: Label-Efficient Online Continual Object Detection (LEOCOD) in streaming video. Addressing this problem would greatly benefit many real-world applications (e.g., personalized robots and augmented/virtual reality headsets) by reducing data annotation costs and model retraining time. To tackle this problem, we draw inspiration from complementary learning systems (CLS) in the human brain and propose Efficient-CLS, a plug-and-play module that can be easily inserted into existing continual learners to improve them. On two challenging CL benchmarks for streaming real-world videos, we integrate Efficient-CLS into state-of-the-art CL algorithms and achieve significant improvement with minimal forgetting across all supervision levels. Remarkably, with only 25% of video frames annotated, Efficient-CLS still outperforms the base CL learners trained with 100% annotations on all video frames. We will make source code publicly available upon publication.
Please Choose The Closest Area That Your Submission Falls Into: Applications (e.g., speech processing, computer vision, NLP)
TL;DR: Towards label-efficient online continual object detection in video streams, our Efficient-CLS uses only 25% of the annotation cost while still outperforming the best baseline.
Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/arxiv:2206.00309/code)
