A General Multistage Deep Learning Framework for Sensor-Based Human Activity Recognition Under Bounded Computational Budget

Published: 01 Jan 2024, Last Modified: 09 Feb 2025. IEEE Trans. Instrum. Meas. 2024. License: CC BY-SA 4.0
Abstract: In recent years, sliding windows have been widely employed for sensor-based human activity recognition (HAR) due to their simplicity of implementation. In this article, motivated by the observation that not all time intervals in a window are activity-relevant, we propose a novel multistage HAR framework named MS-HAR, which implements a sequential decision procedure that progressively processes a sequence of relatively small intervals, i.e., a reduced input, automatically cropped from the original window via reinforcement learning. This design naturally supports dynamic inference at runtime: processing can be terminated at any stage once the network is sufficiently confident in its current prediction. Compared with most existing works that process the whole window directly, our method allows the computational budget to be controlled precisely online by setting confidence thresholds, so that the network spends more computation on a "difficult" activity and less on an "easy" one under a finite computational budget. Extensive experiments on four benchmark HAR datasets, namely WISDM, PAMAP2, USC-HAD, and one weakly labeled dataset, demonstrate that our method is considerably more flexible and efficient than competitive baselines. Notably, the proposed framework is general, as it is compatible with most mainstream backbone networks.
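The confidence-thresholded early-exit idea described in the abstract can be illustrated with a minimal sketch. The function names and the list-of-logits interface below are hypothetical illustrations, not the paper's actual implementation: each "stage" is assumed to produce a logit vector after seeing one more cropped interval, and inference stops as soon as the top-class probability exceeds a user-set threshold.

```python
import numpy as np

def softmax(logits):
    """Convert a logit vector to a probability distribution (numerically stable)."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

def early_exit_inference(stage_logits, threshold=0.9):
    """Hypothetical sketch of dynamic inference with a confidence threshold.

    stage_logits: list of logit vectors, one per stage (later stages have
                  seen more cropped intervals of the window).
    Returns (predicted_class, stages_used): terminates at the first stage
    whose top-class confidence reaches the threshold, otherwise falls back
    to the final stage's prediction once the budget is exhausted.
    """
    for k, logits in enumerate(stage_logits, start=1):
        probs = softmax(np.asarray(logits, dtype=float))
        if probs.max() >= threshold:  # confident enough: stop early
            return int(probs.argmax()), k
    return int(probs.argmax()), len(stage_logits)
```

Raising the threshold forces more stages (more computation, typically higher accuracy); lowering it lets "easy" activities exit after the first stage, which is how the framework trades accuracy for compute online.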