Unsupervised tractive momentum: a novel unsupervised few‑shot learning framework

Zhong Cao, Jiang Lu, He Liu, Yuheng Luo

Published: 27 Aug 2025, Last Modified: 05 May 2026. The Journal of Supercomputing, Volume 81, article number 1281 (2025). CC BY 4.0

Abstract: Few-shot learning (FSL) aims to distill transferable knowledge from existing concepts in order to cope with novel concepts for which only a few labeled examples are available. Most popular FSL methods acquire this knowledge by learning from large-scale supervised data on the existing concepts. Because obtaining supervised data can be difficult and burdensome, we pursue a milder prerequisite for FSL: using unsupervised rather than supervised data to acquire the transferable knowledge. We propose a novel, easy-to-implement FSL framework, Unsupervised Tractive Momentum (UTM), that requires only unsupervised data of existing concepts. UTM is composed of modular dual encoders, a combinatorial loss mechanism, and a classifier, which together form a reusable and extensible learning system. UTM randomly samples unsupervised data and augments them to create many synthetic query-key matching tasks on the fly, and deploys two distinct encoders with identical architecture, named the traction encoder and the momentum encoder, to learn a representation space through a combinatorial parameter-updating scheme. The representation space learned on unsupervised data is expected to be a good fit for few-shot recognition on novel concepts. The dual encoders are parallelizable, and UTM is optimized for scalable training in GPU-based high-performance computing environments. The parameter-updating mechanism in UTM is justified theoretically from the perspective of convergence, and a theoretical loss bound is proved that mathematically quantifies the relationship between our self-supervised UTM and the vanilla supervised method; this convergence and bound analysis further supports deployment in distributed systems.
Extensive experimental evaluation on several benchmark datasets demonstrates that UTM yields significant improvements over state-of-the-art unsupervised methods and comes very close to supervised methods, results that are also well explained by our theory.
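To make the query-key matching idea concrete, the following is a minimal NumPy sketch of one training step with two encoders of identical architecture. It is not the authors' implementation: the abstract does not specify UTM's combinatorial loss or its parameter-updating rule, so the sketch assumes an InfoNCE-style matching loss and a MoCo-style exponential moving average for the momentum encoder. The linear encoder, the noise-based `augment` function, the finite-difference gradient, and all hyperparameters (`temperature`, `m`, `lr`) are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import numpy as np

def augment(x, rng, noise=0.1):
    # Hypothetical augmentation: additive Gaussian noise stands in for image augmentations.
    return x + noise * rng.standard_normal(x.shape)

def encode(W, x):
    # Toy linear encoder followed by L2 normalisation of each embedding.
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def matching_loss(q, k, temperature=0.2):
    # Assumed InfoNCE-style loss: query i should match key i among all keys in the batch.
    logits = q @ k.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def momentum_update(W_momentum, W_traction, m=0.99):
    # Assumed MoCo-style exponential moving average; UTM's actual rule may differ.
    return m * W_momentum + (1.0 - m) * W_traction

def numerical_grad(f, W, eps=1e-5):
    # Central finite differences; only practical for this tiny toy encoder.
    g = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[idx] += eps
        Wm[idx] -= eps
        g[idx] = (f(Wp) - f(Wm)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))            # 8 unlabeled samples, 16 features each
W_traction = rng.standard_normal((16, 4))   # traction encoder weights
W_momentum = W_traction.copy()              # momentum encoder: same architecture

# Two augmented views of the same samples form one synthetic matching task.
view_q, view_k = augment(x, rng), augment(x, rng)
q = encode(W_traction, view_q)
k = encode(W_momentum, view_k)              # keys are treated as constants (assumption)
loss_before = matching_loss(q, k)

# Gradient step on the traction encoder, then EMA update of the momentum encoder.
lr = 0.1
grad = numerical_grad(lambda W: matching_loss(encode(W, view_q), k), W_traction)
W_traction = W_traction - lr * grad
W_momentum = momentum_update(W_momentum, W_traction)
```

In this sketch only the traction encoder receives gradients, while the momentum encoder trails it slowly, which keeps the keys stable across steps. Whether UTM couples the two encoders in exactly this way is not stated in the abstract.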