Toward Using Multi-Modal Machine Learning for User Behavior Prediction in Simulated Smart Home for Extended Reality

Published: 2022, Last Modified: 22 Jan 2026, VR Workshops 2022, CC BY-SA 4.0
Abstract: In this work, we propose a multi-modal approach to controlling smart home devices in a smart home environment simulated in virtual reality (VR). We determine the user's target device and the desired action from their utterance, spatial information (gestures, positions, etc.), or a combination of the two. Since the information contained in the user's utterance and the spatial information can be disjoint or complementary, we process the two sources in parallel using an array of machine learning models. We use ensemble modeling to aggregate the results of these models and improve the quality of the final prediction. We present our preliminary architecture, models, and findings.
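The aggregation step described in the abstract could be sketched roughly as follows. This is only an illustration, not the authors' architecture: it assumes each modality's model emits a probability distribution over hypothetical "device:action" labels and combines them with weighted soft voting, one common ensemble technique. The names `ensemble_predict`, `speech_weight`, and the label strings are all assumptions introduced for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical label space: "device:action" strings (not taken from the paper).
Distribution = Dict[str, float]


def normalize(dist: Distribution) -> Distribution:
    """Rescale scores so they sum to 1."""
    total = sum(dist.values())
    if total <= 0:
        raise ValueError("distribution must have positive mass")
    return {label: p / total for label, p in dist.items()}


def ensemble_predict(
    speech: Optional[Distribution],
    spatial: Optional[Distribution],
    speech_weight: float = 0.5,
) -> str:
    """Weighted soft voting over whichever modalities are available.

    Assumption: either input may be missing (None or empty), in which
    case the prediction falls back to the remaining modality alone.
    """
    if not 0.0 < speech_weight < 1.0:
        raise ValueError("speech_weight must be strictly between 0 and 1")

    sources: List[Tuple[float, Distribution]] = []
    if speech:
        sources.append((speech_weight, normalize(speech)))
    if spatial:
        sources.append((1.0 - speech_weight, normalize(spatial)))
    if not sources:
        raise ValueError("at least one modality is required")

    # Sum the weighted probabilities per label across modalities.
    combined: Distribution = {}
    for weight, dist in sources:
        for label, p in dist.items():
            combined[label] = combined.get(label, 0.0) + weight * p

    # Return the label with the highest combined score.
    return max(combined, key=lambda label: combined[label])
```

With equal weights, a speech distribution of `{"lamp:on": 0.6, "tv:on": 0.4}` and a spatial distribution of `{"lamp:on": 0.1, "tv:on": 0.9}` combine to scores of 0.35 and 0.65, so the sketch selects `"tv:on"`; with the spatial input missing, it falls back to `"lamp:on"` from speech alone.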