Abstract: This paper addresses open-world object detection (OWOD) in autonomous driving scenarios, which aims to detect objects under both domain-agnostic and category-agnostic settings simultaneously. Existing OWOD algorithms mainly focus either on detecting pre-defined object categories under various conditions (domain-agnostic) or on zero-shot object detection (category-agnostic), but not both. The knowledge gap between seen and unseen object categories challenges models optimized with supervision from only the seen categories, and domain differences across scenarios further complicate aligning observations with different appearances. To address these two challenges simultaneously, we propose Instance Dictionary Learning (IDL) for more robust and accurate OWOD. We first design a pre-training procedure that builds mappings between region features and category semantic embeddings through instance contrastive learning. The joint vision-semantic space is formulated via a fine-grained instance-level "Dictionary", which expresses region-category correspondences and links seen and unseen object categories. A domain discrimination component is further designed to extract domain-invariant feature representations seamlessly during subsequent training. The proposed IDL can detect unseen categories from unseen domains without any bounding-box annotations, while incurring no obvious performance drop on seen categories. Comprehensive experiments show that our method achieves new state-of-the-art OWOD performance over previous algorithms.
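The instance contrastive learning step described above, which aligns region features with category semantic embeddings, can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch under assumed shapes and a standard temperature value; the paper's exact formulation, symbols, and hyperparameters are not given in the abstract.

```python
import numpy as np

def instance_contrastive_loss(region_feats, text_embeds, labels, temperature=0.07):
    """Sketch of an instance-level contrastive loss (InfoNCE-style).

    region_feats: (num_regions, d) detected-region features (assumed shape)
    text_embeds:  (num_categories, d) category semantic embeddings
    labels:       (num_regions,) index of each region's category
    temperature:  assumed value; the paper's setting is not stated
    """
    # L2-normalize both sides so the dot product is cosine similarity.
    r = region_feats / np.linalg.norm(region_feats, axis=1, keepdims=True)
    t = text_embeds / np.linalg.norm(text_embeds, axis=1, keepdims=True)
    logits = r @ t.T / temperature  # (num_regions, num_categories)
    # Numerically stable softmax cross-entropy against each region's category.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

Because the semantic embeddings live in a shared vision-language space, a loss of this form lets the detector score regions against category names it was never supervised on, which is the mechanism the abstract relies on for linking seen and unseen categories.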