Efficient Spatiotemporal-Structural Masking for Dynamic Human Activity Recognition With Optimized Computation
Abstract: Recently, deep convolutional neural networks (CNNs) have achieved outstanding success in sensor-based human activity recognition (HAR), but at the cost of high computational complexity, which restricts their practical deployment on resource-limited wearable devices. This may be partly attributed to the static nature of most existing CNNs, which process all activity samples uniformly, resulting in structural and data redundancy. Compared with static networks, one promising strategy is to accelerate activity inference by exploiting structural redundancy within deep CNNs, selectively activating computation units such as convolution channels for different samples. Another promising strategy is to exploit spatiotemporal redundancy by concentrating computational effort on the most informative regions of sensor data. How to leverage structural and data redundancy simultaneously remains largely overlooked. In this article, from a new perspective of exploring both structural and spatiotemporal redundancy, we introduce an efficient spatiotemporal-structural masker network (SSMNet) for activity recognition. It utilizes a dual-mask mechanism to make dynamic, sample-specific decisions, thereby accelerating activity inference. The spatiotemporal-structural masker integrates spatiotemporal and structural decisions through masks, dynamically allocating computational resources based on the input with minimal overhead. Extensive experiments are conducted on three public HAR benchmark datasets, namely, WISDM, UniMiB-SHAR, and PAMAP2. SSMNet is guided by a high-accuracy static model, allowing it to reduce computational costs while maintaining state-of-the-art performance. For example, compared with static baselines, it may reduce FLOPs by nearly 40% with an accuracy drop smaller than 1% across all three datasets. The detailed analyses affirm that our method can strike an optimal tradeoff between accuracy and efficiency.
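To make the dual-mask idea concrete, the following is a minimal sketch of how a structural (channel) mask and a spatiotemporal (position) mask could jointly skip work in a single 1-D convolution over a sensor window. It is not the SSMNet implementation: the function names, the hand-written binary masks, the zero-filling of skipped outputs, and the layer sizes are all assumptions chosen for illustration. In a dynamic network the masks would be predicted per sample by a lightweight masker rather than fixed by hand.

```python
from typing import List

Matrix = List[List[float]]


def conv_flops(c_in: int, c_out: int, k: int, t_out: int) -> int:
    """Multiply-accumulate count of a dense 1-D convolution layer."""
    return c_in * c_out * k * t_out


def masked_conv_flops(c_in: int, k: int,
                      channel_mask: List[int],
                      position_mask: List[int]) -> int:
    """Multiply-accumulate count when only active channels/positions are computed."""
    return c_in * sum(channel_mask) * k * sum(position_mask)


def masked_conv1d(x: Matrix, weight: List[Matrix],
                  channel_mask: List[int],
                  position_mask: List[int]) -> Matrix:
    """1-D convolution with zero padding that skips masked-out work.

    x:      [c_in][t] input window
    weight: [c_out][c_in][k] filters (k odd)
    Output entries for skipped channels or positions are left at zero
    (an assumption; other fill strategies are possible).
    """
    c_in, t = len(x), len(x[0])
    k = len(weight[0][0])
    pad = k // 2
    out = [[0.0] * t for _ in weight]
    for o, keep_channel in enumerate(channel_mask):
        if not keep_channel:
            continue  # structural decision: skip the whole output channel
        for p, keep_position in enumerate(position_mask):
            if not keep_position:
                continue  # spatiotemporal decision: skip this time step
            acc = 0.0
            for i in range(c_in):
                for j in range(k):
                    q = p + j - pad
                    if 0 <= q < t:
                        acc += weight[o][i][j] * x[i][q]
            out[o][p] = acc
    return out


# Illustrative sizes only (not taken from the paper's experiments).
channel_mask = [1, 1, 1, 1, 1, 1, 0, 0]          # keep 6 of 8 output channels
position_mask = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]   # keep 8 of 10 time steps
dense = conv_flops(c_in=3, c_out=8, k=3, t_out=10)
sparse = masked_conv_flops(c_in=3, k=3,
                           channel_mask=channel_mask,
                           position_mask=position_mask)
print(dense, sparse)  # 720 432
```

In this toy configuration, keeping 6 of 8 channels and 8 of 10 time steps leaves 432 of the 720 dense multiply-accumulates, because the two masks multiply: the savings from each kind of redundancy compound rather than add. The sketch ignores the cost of producing the masks, which the abstract describes as minimal.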