Semi-ViM: Bidirectional State Space Model for Mitigating Label Imbalance in Semi-Supervised Learning
Abstract: Semi-supervised learning (SSL) often suffers from learning bias when trained on imbalanced datasets, which limits its effectiveness in real-world applications. In this paper, we propose Semi-ViM, a novel SSL framework based on Vision Mamba, a bidirectional state space model (SSM) that serves as a superior alternative to Transformer-based architectures for visual representation learning. Semi-ViM handles imbalanced datasets effectively and improves model stability through two key innovations: LyapEMA, a stability-aware parameter update mechanism inspired by Lyapunov theory, and SSMixup, a novel mixup strategy applied at the hidden-state level of bidirectional SSMs. Experimental results on ImageNet-1K and ImageNet-LT demonstrate that Semi-ViM significantly outperforms state-of-the-art SSL models, achieving 85.40% accuracy with only 10% of the labeled data and surpassing Transformer-based methods such as Semi-ViT.
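
To make the two ingredients named above more concrete, the following is a minimal sketch of the generic building blocks they appear to extend: a plain exponential moving average (EMA) teacher update and a plain mixup applied to hidden representations. It is not the LyapEMA or SSMixup procedure itself; the abstract does not specify the Lyapunov-based stability criterion or where inside the bidirectional SSM the mixing takes place. The function names, the dictionary-of-arrays parameter layout, and the default decay value are assumptions chosen for illustration only.

```python
import numpy as np


def ema_update(teacher, student, decay=0.999):
    """Plain EMA teacher update (assumption: NOT the stability-aware LyapEMA rule).

    teacher, student: dicts mapping parameter names to NumPy arrays of equal shape.
    Returns a new dict holding decay * teacher + (1 - decay) * student.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


def mixup_hidden(h_a, h_b, y_a, y_b, lam):
    """Plain mixup on hidden states (assumption: a stand-in for SSMixup).

    h_a, h_b: hidden-state arrays of the same shape from two samples.
    y_a, y_b: their label vectors (for example one-hot or soft pseudo-labels).
    lam: mixing coefficient in [0, 1].
    Returns the convex combination of the hidden states and of the labels.
    """
    h_mix = lam * h_a + (1.0 - lam) * h_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return h_mix, y_mix


if __name__ == "__main__":
    # Toy parameters: one weight vector per model.
    teacher = {"w": np.array([1.0, 1.0])}
    student = {"w": np.array([0.0, 2.0])}
    print(ema_update(teacher, student, decay=0.9)["w"])

    # Toy hidden states of shape (sequence_length, hidden_dim) and one-hot labels.
    h_a, h_b = np.ones((4, 3)), np.zeros((4, 3))
    y_a, y_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    h_mix, y_mix = mixup_hidden(h_a, h_b, y_a, y_b, lam=0.25)
    print(h_mix[0], y_mix)
```

In a typical teacher-student SSL loop, an update of this kind would run after each student optimizer step, and the mixed hidden states would be passed through the remaining layers and trained against the mixed labels.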