Abstract: First-person-view (FPV) cameras are widely used in daily life to record activities and sports. In this paper, we propose a succinct and robust 3D convolutional neural network (CNN) architecture, accompanied by an ensemble-learning network, for activity recognition in FPV videos. The proposed 3D CNN is trained on low-resolution (32 × 32) sparse optical flow computed from FPV video datasets of daily activities. In our experiments, the network achieves an average accuracy of 90%.
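To make the input representation concrete, the following is a minimal sketch of how sparse optical flow vectors might be binned into a 32 × 32 two-channel field and stacked over time into a volume that a 3D CNN could consume. The abstract does not say how the low-resolution flow is produced, so the averaging scheme, the `(x, y, dx, dy)` sample format, and the names `rasterize_flow` and `stack_clip` are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

GRID = 32  # low-resolution side length stated in the abstract

# Hypothetical sample format: (x, y, dx, dy) in source-frame pixel coordinates.
FlowSample = Tuple[float, float, float, float]


def rasterize_flow(samples: List[FlowSample], width: int, height: int,
                   grid: int = GRID) -> List[List[List[float]]]:
    """Average sparse flow vectors into a grid x grid x 2 field.

    Cells that receive no sample stay at zero (an assumption of this sketch).
    """
    sums = [[[0.0, 0.0] for _ in range(grid)] for _ in range(grid)]
    counts = [[0] * grid for _ in range(grid)]
    for x, y, dx, dy in samples:
        # Map pixel coordinates to a grid cell, clamping to the valid range.
        col = max(0, min(int(x * grid / width), grid - 1))
        row = max(0, min(int(y * grid / height), grid - 1))
        sums[row][col][0] += dx
        sums[row][col][1] += dy
        counts[row][col] += 1
    for r in range(grid):
        for c in range(grid):
            if counts[r][c]:
                sums[r][c][0] /= counts[r][c]
                sums[r][c][1] /= counts[r][c]
    return sums


def stack_clip(per_frame_samples: List[List[FlowSample]], width: int,
               height: int) -> List[List[List[List[float]]]]:
    """Stack per-frame fields into a T x 32 x 32 x 2 volume."""
    return [rasterize_flow(s, width, height) for s in per_frame_samples]
```

A clip of T frames then yields a T × 32 × 32 × 2 volume, which is the kind of spatiotemporal input a 3D convolution operates on.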
External IDs: dblp:conf/icpr/KaoLCCLH20