Feature Representation Transferring to Lightweight Models via Perception Coherence

TMLR Paper 6136 Authors

07 Oct 2025 (modified: 18 Oct 2025) · Under review for TMLR · CC BY 4.0
Abstract: In this paper, we propose a method for transferring feature representations from larger teacher models to lightweight student models. We mathematically define a new notion called perception coherence. Based on this notion, we propose a loss function that takes into account the dissimilarities between data points in feature space through their ranking. At a high level, by minimizing this loss function, the student model learns to mimic how the teacher model perceives inputs. More precisely, our method is motivated by the fact that the representational capacity of the student model is weaker than that of the teacher. Hence, we develop a conceptually new method that allows a better relaxation: the student model need not preserve the teacher's absolute geometry, only the global coherence of its dissimilarity rankings. Importantly, while rankings are defined only on finite sets, our notion of perception coherence extends them into a probabilistic form. This formulation depends on the input distribution and applies to general dissimilarity metrics. Our theoretical insights provide a probabilistic perspective on the process of feature representation transfer. Our experimental results show that our method outperforms or matches strong baseline methods for representation transfer, particularly class-unaware ones.
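The abstract describes a loss that asks the student to reproduce the teacher's *ordering* of pairwise dissimilarities rather than its exact geometry. The paper's actual loss is defined mathematically in the main text; the sketch below is only one plausible instantiation of this idea, using Euclidean distances and a hinge penalty on rank violations around each anchor point. The function names, the margin parameter, and the hinge formulation are assumptions for illustration, not the authors' definition.

```python
import numpy as np

def pairwise_dist(X):
    # Euclidean distance matrix between the rows of X (one row per sample).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off

def ranking_coherence_loss(f_teacher, f_student, margin=0.1):
    """Illustrative ranking-based transfer loss (not the paper's exact form).

    For each anchor i and pair (j, k): if the teacher ranks j closer to i
    than k, penalize the student whenever it does not reproduce that
    ordering by at least `margin`.
    """
    Dt = pairwise_dist(f_teacher)   # teacher dissimilarities
    Ds = pairwise_dist(f_student)   # student dissimilarities (may have fewer dims)
    n = Dt.shape[0]
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if i in (j, k) or j == k:
                    continue
                if Dt[i, j] < Dt[i, k]:  # teacher: j is closer to anchor i than k
                    total += max(0.0, margin + Ds[i, j] - Ds[i, k])
                    count += 1
    return total / max(count, 1)
```

Note that the student features can live in a lower-dimensional space than the teacher's: only the relative ordering of distances is compared, which is the relaxation the abstract emphasizes. A practical implementation would vectorize the triple loop and sample anchor/pair triplets per batch rather than enumerating all of them.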
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Mingming_Gong1
Submission Number: 6136