Abstract: One of the principal challenges in developing robust Machine Learning (ML) classification algorithms for Human Activity Recognition (HAR) from real-time smart home sensor data is how to account for variations in 1) the activity sequence length, 2) the contribution each sensor makes to an activity, and 3) the degree of activity class imbalance. Such variations generate observations that do not conform to expected patterns, potentially reducing the efficacy of classification models. Moreover, the architectures of prior solutions have been quite complex, which has resulted in long training times before these approaches achieve acceptable classification accuracy. In this paper we address these three issues by 1) proposing a data structure representing the duration and frequency information of each sensor for an activity, 2) transforming this data structure into an Information Retrieval (IR)-based representation, and finally 3) comparing the utility of this IR-based representation across four different supervised classifiers. Our proposed framework, in combination with a state-of-the-art ensemble learner, results in more accurate and scalable ML classification models that are better suited to off-line HAR in a smart home setting.
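As a rough illustration of the first two steps, the sketch below builds a per-sensor frequency and duration profile for each activity instance and then converts the frequency counts into a fixed-length weighted vector. The abstract does not say which IR weighting is used, so TF-IDF is assumed here purely for illustration; the sensor IDs, the toy events, and the function names `sensor_profile` and `tfidf_vectors` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the paper): each activity instance is a
# list of (sensor_id, duration_in_seconds) events of arbitrary length.
instances = [
    [("M01", 5.0), ("M01", 3.0), ("D02", 1.0)],
    [("M01", 2.0), ("M03", 10.0)],
    [("D02", 4.0), ("M03", 6.0), ("M03", 2.0)],
]


def sensor_profile(events):
    """Return per-sensor firing counts and total durations for one instance."""
    freq = Counter()
    dur = Counter()
    for sensor, seconds in events:
        freq[sensor] += 1
        dur[sensor] += seconds
    return freq, dur


def tfidf_vectors(instances, vocabulary):
    """Weight sensor firing counts with TF-IDF (an assumed weighting scheme).

    Each instance is treated as a "document" and each sensor as a "term",
    so every instance maps to a vector of length len(vocabulary).
    """
    profiles = [sensor_profile(events)[0] for events in instances]
    n = len(instances)
    # Document frequency: number of instances in which each sensor fires.
    df = {s: sum(1 for p in profiles if p[s] > 0) for s in vocabulary}
    vectors = []
    for p in profiles:
        total = sum(p.values()) or 1  # guard against empty instances
        vectors.append([
            (p[s] / total) * math.log(n / df[s]) if df[s] else 0.0
            for s in vocabulary
        ])
    return vectors


vocabulary = sorted({s for events in instances for s, _ in events})
vectors = tfidf_vectors(instances, vocabulary)
```

Because every instance maps to a vector with one entry per sensor, sequences of different lengths become directly comparable, and sensors that fire in nearly every instance are down-weighted. The duration totals returned by `sensor_profile` could be weighted in the same way; how the paper actually combines duration and frequency is not stated in the abstract.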