Abstract: Effective streaming feature selection in dynamic online environments is essential in numerous applications. However, most existing methods evaluate high-dimensional features individually and ignore potentially relevant group structures among the features. Moreover, the class imbalance underlying streaming data may further reduce the discriminative power of the selected features, degrading classification performance. Motivated by these observations, we propose a proximal cost-sensitive sparse group online learning (PCSGOL) framework to handle imbalanced, high-dimensional streaming data. Specifically, we formulate the task as a new cost-sensitive online optimization problem by leveraging the ℓ₂-norm, ℓ₁-norm, and group-wise sparsity constraints in the dual averaging regularization. The average weighted distance is also introduced in PCSGOL to achieve stable prediction results. We mathematically derive closed-form solutions to the optimization problems with four modified hinge loss functions, leading to four variants of PCSGOL. Extensive empirical studies on real-world streaming datasets demonstrate the effectiveness of the proposed method.
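To make the two main ingredients concrete, the following is a minimal sketch of a generic sparse group proximal step (element-wise ℓ₁ soft-thresholding followed by group-wise ℓ₂ shrinkage) and a class-weighted hinge loss. It is not the closed-form PCSGOL update derived in the paper: the function names, the group layout, the thresholds `lam1` and `lam2`, and the costs `c_pos` and `c_neg` are all illustrative assumptions.

```python
import numpy as np


def sparse_group_prox(v, groups, lam1, lam2):
    """Proximal operator of lam1*||w||_1 + lam2*sum_g ||w_g||_2, evaluated at v.

    `groups` is a list of index lists forming a non-overlapping partition
    of the features (a hypothetical layout, chosen for illustration).
    """
    # Step 1: element-wise soft-thresholding handles the l1 term.
    u = np.sign(v) * np.maximum(np.abs(v) - lam1, 0.0)
    w = np.zeros_like(u)
    # Step 2: group-wise shrinkage handles the group l2 term;
    # a group whose norm is at most lam2 is zeroed out entirely.
    for idx in groups:
        norm = np.linalg.norm(u[idx])
        if norm > lam2:
            w[idx] = (1.0 - lam2 / norm) * u[idx]
    return w


def cost_sensitive_hinge(w, x, y, c_pos, c_neg):
    """Hinge loss scaled by a class-dependent cost (labels y in {+1, -1})."""
    cost = c_pos if y == 1 else c_neg
    return cost * max(0.0, 1.0 - y * float(np.dot(w, x)))


# Usage: the second group is removed as a whole, the first is only shrunk.
v = np.array([3.0, -0.5, 0.2, -0.1])
groups = [[0, 1], [2, 3]]
w = sparse_group_prox(v, groups, lam1=0.5, lam2=0.5)
x = np.array([0.25, 0.0, 0.0, 0.0])
loss_pos = cost_sensitive_hinge(w, x, y=1, c_pos=4.0, c_neg=1.0)
loss_neg = cost_sensitive_hinge(w, x, y=-1, c_pos=4.0, c_neg=1.0)
```

Assigning a larger cost to the minority class (here assumed to be the positive class) penalizes its margin violations more heavily, which is the general idea behind cost-sensitive learning on imbalanced streams.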