High-Performance Self-Supervised Learning by Joint Training of Flow Matching

Published: 03 Feb 2026, Last Modified: 15 Apr 2026, Venue: AISTATS 2026 Spotlight, License: CC BY 4.0
TL;DR: We propose a pretrained flow matching model that is conditioned on learned representations rather than on class labels. The learned representations are intended for classification as well as other downstream tasks.
Abstract: Diffusion models can learn rich representations during data generation, showing potential for Self-Supervised Learning (SSL), but they face a trade-off between generative quality and discriminative performance. Their iterative sampling also incurs substantial computational and energy costs, hindering industrial and edge AI applications. To address these issues, we propose the Flow Matching-based Sensor Foundation Model (SenFlow), which jointly trains a representation encoder and a conditional flow matching generator. This decoupled design achieves both high-fidelity generation and effective recognition. By using flow matching to learn a simpler velocity field, SenFlow accelerates and stabilizes training, improving its efficiency for representation learning. Experiments on wearable sensor data show that SenFlow reduces training time by 50.4% compared to a diffusion-based approach. On downstream tasks, SenFlow surpasses the state-of-the-art SSL method on all five datasets while achieving up to a 51.0x inference speedup and maintaining high generative quality. The implementation code is available at https://github.com/Okita-Laboratory/SenFlow.
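To make the idea of a representation-conditioned velocity field concrete, the following is a minimal, dependency-free sketch of a conditional flow matching objective and an Euler sampler. It is not taken from the SenFlow repository: the straight-line path between noise and data, the function names (`interpolate`, `target_velocity`, `flow_matching_loss`, `euler_sample`), and the `encoder` / `velocity_model` callables are all illustrative assumptions, and the actual SenFlow architecture and loss may differ.

```python
import random

def interpolate(x0, x1, t):
    """Point at time t on the straight line from noise x0 to data x1 (assumed path)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight-line path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(velocity_model, encoder, x1, rng):
    """Single-sample flow matching loss, conditioned on a representation z.

    velocity_model(xt, t, z) and encoder(x1) are hypothetical callables.
    In joint training, gradients of this loss would update both of them.
    """
    z = encoder(x1)                              # representation, not a class label
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]       # Gaussian noise sample
    t = rng.random()                             # time drawn uniformly from [0, 1)
    xt = interpolate(x0, x1, t)
    v_pred = velocity_model(xt, t, z)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(x1)

def euler_sample(velocity_model, z, x0, num_steps):
    """Generate a sample by integrating dx/dt = v(x, t, z) from t=0 to t=1."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        v = velocity_model(x, k * dt, z)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In this sketch, generation cost scales with `num_steps`, the number of velocity evaluations in `euler_sample`. A nearly straight velocity field can be integrated in few steps, which is one plausible reading of how flow matching reduces sampling cost relative to iterative diffusion sampling.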
Submission Number: 1992