Abstract: Augmented reality and wearable devices, such as smart eyewear systems, have recently gained significant attention, driven by advances in computer vision and the proliferation of compact wearable cameras. This has spurred interest in egocentric vision, which offers a unique first-person perspective for recognizing human actions and understanding behavior. However, existing approaches to egocentric action recognition often rely on computationally demanding architectures, such as large transformers, which are unsuitable for real-time use on wearable devices with limited processing power. This work develops a lightweight, real-time egocentric action recognition system tailored to resource-constrained environments. We evaluate the recent LaViLa model for online adaptation and explore the lightweight MiniROAD model, originally designed for exocentric Online Action Detection, on egocentric data. We also create a focused dataset, EgoClip Office, to optimize the model for our target application. Our approach is validated on an Nvidia Jetson platform, demonstrating the feasibility of real-time performance on low-power embedded devices.
External IDs: dblp:conf/eccv/SantambrogioCCPMTM24