Multiview Representation Learning for Human Activity Recognition

Published: 01 Jan 2022, Last Modified: 26 Jul 2025 · CIVEMSA 2022 · CC BY-SA 4.0
Abstract: Over the years, multi-view representation learning has become widespread in machine learning and deep learning, driven by the availability of extensive data and the contributions of researchers worldwide. Many works in activity recognition assume that the sensor data used to train a model come from the same body location; such methods struggle when the target user and the body location are unknown. To build a model that recognizes human activities and transportation modes in a user-independent manner, we employ deep canonical correlation analysis (DCCA), a method that learns representations of two views such that nonlinearly correlated views become maximally linearly correlated. DCCA can be viewed as an alternative to kernel canonical correlation analysis (KCCA): whereas KCCA relies on an inner product (kernel), DCCA does not, and it scales well with the size of the training set. Moreover, we observe that approaches based on DCCA outperform those based on linear CCA and KCCA. This paper presents an implementation of representation learning with deep neural networks, deep canonical correlation analysis, and several of its extensions. Our approach consists of two parts: reducing the impact of each phone position and recognizing the correct activity. The proposed architecture comprises three stages: first, it recognizes the input source; second, it normalizes the data to speed up training and to stabilize the effect of the source on the activity learning process; and finally, it recognizes the correct activity.
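The abstract does not include implementation details, so the following is only a minimal sketch of the general DCCA idea it describes: two view-specific neural encoders trained to maximize the total canonical correlation between their outputs. The encoder sizes, the cca_loss helper, and the regularization constant eps are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class ViewEncoder(nn.Module):
    """Simple MLP encoder for one sensor view (hypothetical layer sizes)."""
    def __init__(self, in_dim, out_dim=10, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

def cca_loss(h1, h2, eps=1e-4):
    """Negative total canonical correlation between two projected views."""
    n = h1.size(0)
    h1 = h1 - h1.mean(dim=0, keepdim=True)
    h2 = h2 - h2.mean(dim=0, keepdim=True)
    eye1 = torch.eye(h1.size(1), device=h1.device)
    eye2 = torch.eye(h2.size(1), device=h2.device)
    s11 = (h1.t() @ h1) / (n - 1) + eps * eye1
    s22 = (h2.t() @ h2) / (n - 1) + eps * eye2
    s12 = (h1.t() @ h2) / (n - 1)
    # Whiten each view: T = S11^{-1/2} S12 S22^{-1/2}. The sum of the
    # singular values of T is the total correlation captured by the
    # learned projections; we minimize its negative.
    d1, v1 = torch.linalg.eigh(s11)
    d2, v2 = torch.linalg.eigh(s22)
    s11_inv_sqrt = v1 @ torch.diag(d1.clamp_min(eps).rsqrt()) @ v1.t()
    s22_inv_sqrt = v2 @ torch.diag(d2.clamp_min(eps).rsqrt()) @ v2.t()
    t = s11_inv_sqrt @ s12 @ s22_inv_sqrt
    return -torch.linalg.svdvals(t).sum()

# Hypothetical usage: two views could be feature windows from two phone
# positions (the 60-dimensional inputs and batch size are placeholders).
enc1, enc2 = ViewEncoder(in_dim=60), ViewEncoder(in_dim=60)
x1, x2 = torch.randn(256, 60), torch.randn(256, 60)
loss = cca_loss(enc1(x1), enc2(x2))
loss.backward()
```

In this formulation the classifier for activity recognition is trained separately on the correlated representations; the sketch covers only the representation-learning objective.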