Cross-Domain Few-Shot Classification via Maximizing Optimized Kernel Dependence

20 Sept 2023 (modified: 11 Feb 2024), submitted to ICLR 2024
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: cross-domain, few-shot classification, Hilbert-Schmidt independence criterion, computer vision, deep learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: In cross-domain few-shot classification, the \emph{nearest centroid classifier} (NCC) aims to learn representations that construct a metric space in which few-shot classification is performed by measuring the similarities between samples and the prototype of each class. The intuition behind NCC is that each sample is pulled closer to the centroid of the class it belongs to while being pushed away from those of other classes. However, in this paper, we find that NCC-learned representations of two samples from different classes can still exhibit high similarities. These undesirable high similarities may induce uncertainty and further result in incorrect classification of samples. To address this problem, we propose a bi-level optimization framework, \emph{maximizing optimized kernel dependence} (MOKD), to learn better similarities (dependence) among samples, such that similarities among samples belonging to the same class are maximized while similarities between samples from different classes are minimized. Specifically, MOKD first optimizes the kernel of the \emph{Hilbert-Schmidt Independence Criterion} (HSIC) by maximizing its test power, yielding a powerful kernel dependence measure, the optimized kernel HSIC (opt-HSIC). Then, an optimization problem w.r.t. the opt-HSIC is solved to maximize the similarities among samples belonging to the same class while simultaneously minimizing the similarities among all samples. Since kernel HSIC with large test power is sensitive to dependence, it can precisely measure the dependence among representations of samples. Extensive experiments on the popular benchmark Meta-Dataset show that MOKD achieves \emph{state-of-the-art} generalization performance on unseen domains under most task settings and learns better data clusters.
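The abstract builds on the kernel HSIC dependence measure. As a rough illustration of the quantity being optimized (not the paper's actual MOKD implementation, and using an assumed Gaussian kernel with a fixed bandwidth rather than the test-power-optimized kernel the paper describes), the standard biased empirical HSIC estimator can be sketched as:

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """RBF kernel matrix; sigma is an assumed fixed bandwidth.

    In MOKD the kernel parameters would instead be tuned by
    maximizing the HSIC test power, which is not shown here.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(K, L):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2,
    where H = I - (1/n) 11^T is the centering matrix.

    Larger values indicate stronger statistical dependence
    between the two sets of representations.
    """
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

For dependent samples (e.g., features of samples sharing a class) this estimator returns a larger value than for independent ones, which is the signal MOKD maximizes within classes and minimizes across all samples.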
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2357