Abstract: Self-supervised contrastive learning (CL) seeks to learn generalizable feature representations via the self-supervision of pairwise similarities, where existing CL approaches usually build definite similarity labels (e.g., positive or negative) for model training. Yet in practice, the same pair of instances may receive opposite similarity labels in different scenarios; e.g., two images from different CIFAR-100 classes form a dissimilar pair there, but can become a similar pair under the coarser 20-superclass labeling of CIFAR-20. Learning with definite similarities can hardly yield an ideal representation that simultaneously characterizes the similar and dissimilar patterns (e.g., the shared contexts and the distinguishing details) between any two instances. Therefore, the pairwise similarities used for CL should be treated as agnostic, and we argue that simultaneously considering both the similarity and the dissimilarity of each data pair can yield more generalizable representations. To this end, we propose similarity-agnostic CL (SACL), which generalizes the instance-discrimination strategy of conventional CL into a new multi-objective programming (MOP) form. In SACL, we build multiple projection layers with corresponding regularizers that constrain the pairwise-distance matrix to different sparsity levels under different objectives, so that the resulting adjustable distances capture both the similarity and the dissimilarity between each pair of instances. We show that SACL can be equivalently converted into a single learning objective that is easily solved by stochastic optimization with convergence guarantees. Theoretically, we prove a tighter error bound than those of conventional CL approaches; empirically, our method improves downstream task performance on image, text, and graph data.
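The mechanism sketched in the abstract can be made concrete. The PyTorch code below is a minimal illustration under stated assumptions, not the paper's implementation: the name SACLHead, the number of heads, the InfoNCE base loss, and the mean-distance penalty weights are all hypothetical stand-ins for the paper's multiple projection layers, per-objective sparsity regularizers, and scalarized single objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SACLHead(nn.Module):
    """Scalarized multi-objective contrastive loss with several projection
    heads, each paired with a different sparsity penalty on the pairwise
    distance matrix (illustrative sketch; names and weights are assumptions)."""

    def __init__(self, dim=512, proj_dim=128, num_heads=3, temperature=0.5):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, proj_dim))
            for _ in range(num_heads)
        )
        # One regularizer weight per objective: a larger weight pushes that
        # head's distance matrix toward more near-zero ("similar") entries.
        self.sparsity = [0.1 * k for k in range(num_heads)]
        self.t = temperature

    def forward(self, h1, h2):
        """h1, h2: (B, dim) backbone features of two augmented views."""
        total = 0.0
        for head, lam in zip(self.heads, self.sparsity):
            z1 = F.normalize(head(h1), dim=1)
            z2 = F.normalize(head(h2), dim=1)
            # Standard InfoNCE term: positives sit on the diagonal.
            logits = z1 @ z2.T / self.t
            labels = torch.arange(z1.size(0), device=z1.device)
            objective = F.cross_entropy(logits, labels)
            # Sparsity regularizer: shrinking the mean pairwise distance
            # drives more entries of the (B, B) distance matrix toward zero.
            dist = torch.cdist(z1, z2)
            total = total + objective + lam * dist.mean()
        # Scalarization: the multiple objectives collapse into one loss,
        # solvable by ordinary stochastic optimization.
        return total / len(self.heads)
```

In this reading, each objective's weight fixes how sparse (i.e., how mutually similar) that head is allowed to see the batch, so the same pair of instances can be near in one head's distance matrix and far in another's, matching the abstract's similarity-agnostic view.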