ActivityMamba: A CNN-Mamba Hybrid Neural Network for Efficient Human Activity Recognition

Published: 01 Jan 2025, Last Modified: 15 Oct 2025, IEEE Trans. Mob. Comput. 2025, CC BY-SA 4.0
Abstract: Current research in human activity recognition primarily emphasizes accuracy, with limited attention to computational efficiency and hardware compatibility. Recently, Mamba has attracted substantial interest in deep learning. Mamba is a hardware-aware algorithm that enables very efficient training and inference. Researchers have applied it to a range of tasks, with promising results in both language and vision. It is therefore worthwhile to investigate Mamba for efficient human activity recognition. In this paper, we propose ActivityMamba, a hybrid neural network that integrates a CNN with visual Mamba. The SE-Mamba block in ActivityMamba combines the CNN’s local context modeling with Mamba’s global context modeling while remaining efficient in both computation and memory. We evaluate ActivityMamba on five public benchmark datasets collected using three different sensing techniques. ActivityMamba achieves higher performance than vision transformers, vision Mamba, and CNNs, with fewer FLOPs and parameters. It sets a new state of the art (SOTA) on all five datasets: 91.78% overall accuracy (OA) and 89.13% F1 on the USC-HAD dataset, 99.19% OA and 98.64% F1 on the UT-HAR dataset, 99.82% for both OA and F1 on the DIAT dataset, 98.59% OA and 98.65% F1 on the UCI-HAR dataset, and 95.41% OA and 93.14% F1 on the UniMib dataset. Our work is the first to investigate a CNN-Mamba hybrid network for efficient human activity recognition.
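The abstract reports results as OA and F1. As a rough illustration of how such numbers can be computed from predicted and true activity labels, here is a minimal sketch in plain Python. It rests on assumptions the abstract does not state: OA is taken to mean overall accuracy (the fraction of correctly classified samples), and F1 is taken to be the macro average over classes. The paper may use a different averaging scheme, and the function names below are illustrative only.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # true class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    y_true = ["walk", "walk", "sit", "sit", "run", "run"]
    y_pred = ["walk", "sit", "sit", "sit", "run", "walk"]
    print(f"OA = {overall_accuracy(y_true, y_pred):.4f}")
    print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

On class-imbalanced datasets OA and macro F1 can diverge noticeably, which may be why the abstract reports both.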