Abstract: One-shot object detection (OSOD) uses a query patch to identify objects of the same category in a target image. In the OSOD setting, the target images are required to contain the object category of the query patch, and the image styles (domains) of the query patch and target images are assumed to be similar. However, in practical applications, these requirements are often not satisfied.
Therefore, we propose a new problem, namely Cross-Domain Object Search (CDOS), where the object categories of the query patch and target image are decoupled, and the image styles between them may also differ significantly. For this problem, we develop a new method that incorporates both foreground-background contrastive learning heads and a domain-generalized feature augmentation technique. This enables our method to effectively handle the object category gap and the domain distribution gap between the query patch and target image in the training and testing datasets. We further build a new benchmark for the proposed CDOS problem, on which our method shows significant performance improvements over the comparison methods.
Primary Subject Area: [Engagement] Multimedia Search and Recommendation
Secondary Subject Area: [Content] Media Interpretation
Relevance To Conference: In this work, we study the problem of Cross-Domain Object Search (CDOS). This problem can be regarded as a multimedia retrieval problem, which is a typical topic in the multimedia community.
Specifically, building on previous one-shot object detection, this work makes the following two extensions. First, we expand the search space from specific target images to a query-free retrieval gallery. Second, we expand the image styles in the gallery from natural images to a variety of styles, e.g., art style and cartoon style.
We believe that the studied problem is more closely aligned with the real-world requirements of multimedia retrieval, which has many practical applications, e.g., retrieval and browsing in Web search and multimedia-based artwork retrieval.
Supplementary Material: zip
Submission Number: 1445