Active Learning via Classifier Impact and Greedy Selection for Interactive Image Retrieval

TMLR Paper2324 Authors

03 Mar 2024 (modified: 18 Mar 2024) · Under review for TMLR
Abstract: Active Learning (AL) is a user-interactive approach aimed at reducing annotation costs by selecting the most crucial examples to label. Although AL has been extensively studied for image classification tasks, the specific scenario of interactive image retrieval has received relatively little attention. This scenario presents unique characteristics: an open-set, class-imbalanced binary classification that starts with very few labeled samples. We introduce a novel batch-mode Active Learning framework named GAL (Greedy Active Learning) that better copes with this application. It incorporates a new acquisition function for sample selection that measures the impact of each unlabeled sample on the classifier. We further embed this strategy in a greedy selection approach, better exploiting the samples within each batch. We evaluate our framework with both linear (SVM) and non-linear MLP/Gaussian Process classifiers. For the Gaussian Process case, we show a theoretical guarantee on the greedy approximation. Finally, we assess our framework on the interactive content-based image retrieval task on several benchmarks and demonstrate its superiority over existing approaches and common baselines.
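For readers unfamiliar with batch-mode active learning, the following is a minimal, illustrative Python sketch of greedy batch selection driven by a classifier-impact acquisition score. It is not the authors' GAL implementation: the scoring rule (average change of an SVM decision function over the pool when a candidate is added under a hypothesized label), the pseudo-labeling step, and all function names are assumptions made purely for illustration.

```python
# Illustrative sketch only -- not the paper's GAL algorithm.
# Assumes the seed labeled set contains at least one sample of each class.
import numpy as np
from sklearn.svm import LinearSVC

def classifier_impact(X_lab, y_lab, X_pool, x_cand):
    """Score a candidate by how much adding it (under either hypothesized
    label) would change the current decision function over the pool."""
    base = LinearSVC(max_iter=5000).fit(X_lab, y_lab)
    base_scores = base.decision_function(X_pool)
    impact = 0.0
    for y_hyp in (0, 1):  # average over the two hypothesized labels
        clf = LinearSVC(max_iter=5000).fit(
            np.vstack([X_lab, x_cand[None]]), np.append(y_lab, y_hyp))
        impact += np.mean(np.abs(clf.decision_function(X_pool) - base_scores))
    return impact / 2.0

def greedy_batch(X_lab, y_lab, X_unlab, batch_size):
    """Greedily grow a batch: each picked sample is added to the labeled set
    with a pseudo-label, so later picks account for earlier selections."""
    X_lab, y_lab = X_lab.copy(), y_lab.copy()
    pool = list(range(len(X_unlab)))
    batch = []
    for _ in range(batch_size):
        scores = [classifier_impact(X_lab, y_lab, X_unlab[pool], X_unlab[i])
                  for i in pool]
        best = pool[int(np.argmax(scores))]
        batch.append(best)
        # pseudo-label the pick with the current classifier before moving on
        pseudo = LinearSVC(max_iter=5000).fit(X_lab, y_lab).predict(
            X_unlab[best][None])[0]
        X_lab = np.vstack([X_lab, X_unlab[best][None]])
        y_lab = np.append(y_lab, pseudo)
        pool.remove(best)
    return batch  # indices into X_unlab to send to the annotator
```

The per-candidate scores inside each greedy step are independent, which is consistent with the authors' remark that the method is inherently parallel; in practice one would also restrict scoring to a smaller candidate subset, as the paper describes.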
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=OEee0WzVXd
Changes Since Last Submission: Dear Editor, we are writing to submit a major revision of our manuscript titled "Active Learning via Classifier Impact and Greedy Selection for Interactive Image Retrieval", previously under review as TMLR 1428. We appreciate the feedback provided by the reviewers and the editorial team, and, following the Action Editor's encouragement, we have carefully addressed each comment to make the paper ready for publication. In this revision we have made the following changes:

1. Added a high-level summary of the method: We have included a new section (Sec. 3: Algorithm Overview and Motivation) that presents the method at a high level, accompanied by a toy example that treats diversity and uncertainty separately. This example demonstrates the algorithm's distinct characteristics compared to several baselines, focusing on its behavior with respect to uncertainty and diversity, as requested in the last review.

2. Added further runtime analysis and a complexity discussion: Building on our theoretical complexity analysis, we now report CPU runtimes and compare them to other methods. Although our complexity/runtime may be higher than that of several baseline approaches, we gain an accuracy boost while running in reasonable time (see Sec. 5.2.1, page 18, and Fig. 17). This is achieved by our ability to process a smaller candidate set, reducing computation while reaching a higher accuracy level. In particular, for our Gaussian Process method we add runtime measurements for the compared methods, showing our advantage in this respect as well. We also emphasize in the paper that our method is inherently parallel and can be significantly accelerated on a multi-core/GPU device.

3. Improved clarity and organization: We have revised the manuscript to enhance its overall clarity, coherence, and organization. This includes restructuring certain sections, revising the language for clarity and conciseness, and ensuring consistent terminology and formatting throughout.

4. SVM margin issue: We affirm that this concern does not arise in practice, and our toy example validates this claim. In Sec. 3, Fig. 2d, we illustrate the selection pattern of our method, showing that most samples are chosen outside the SVM margin zone, and we report the associated scores as requested in the review. The scenario in which our method selects only points within the margin would arise only if the classifier were nearly accurate, with few or no errors outside the margin zone. Since this is not the case during the cold-start phase, where the classifier is likely to be very inaccurate (as evidenced by our toy example), we contend that this scenario is a particular case and is not inherent to our model.

We are confident that these revisions have considerably enhanced the quality and rigor of our manuscript and effectively address the concerns raised by the reviewers. We believe this revised version represents a substantial improvement over the original submission and will make a valuable contribution to the academic literature in our field, as already recognized in the last review.
Assigned Action Editor: ~Eric_Eaton1
Submission Number: 2324