Multi-View Summarization and Activity Recognition Meet Edge Computing in IoT Environments

25 Sep 2019 (modified: 24 Dec 2019) | ICLR 2020 Conference Withdrawn Submission | Readers: Everyone
  • Keywords: Artificial Intelligence, Big Data, Convolutional Neural Network, Computational Intelligence, Computer Vision, IIoT, IoT, Video Summarization
  • TL;DR: An efficient multi-view video summarization scheme advanced to activity recognition in IoT environments.
  • Abstract: Multi-view video summarization (MVS) has received little research attention because of its major challenges: inter-view correlations and overlapping camera views. Most prior MVS methods are offline, rely only on the summary, require extra communication bandwidth and transmission time, and do not address uncertain environments. Unlike existing methods, we propose edge-intelligence-based MVS and spatio-temporal-feature-based activity recognition for IoT environments. On each slave device at the edge, we segment the multi-view videos into shots using a lightweight CNN object detection model and compute the mutual information among them to generate a summary. Our system does not rely on the summary alone; it encodes the summary and transmits it to a master device equipped with a neural computing stick (NCS), which computes inter-view correlations and recognizes activities, thereby saving computation resources, communication bandwidth, and transmission time. Experiments show an increase of 0.4 in F-measure on the MVS Office dataset, and increases of 0.2% and 2% in activity recognition accuracy on the UCF-50 and YouTube 11 datasets, respectively, with lower storage and transmission time than the state of the art. The processing time for a single frame decreases from 1.23 to 0.45 secs, generating the MVS 0.75 secs faster. Furthermore, we created a new dataset by synthetically adding fog to an MVS dataset to show that our system adapts to both certain and uncertain surveillance environments.
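The abstract mentions computing mutual information to generate the summary but does not say how. As a rough illustration only, the sketch below estimates mutual information between two grayscale frames from a quantized joint histogram and keeps a frame whenever it shares little information with the last kept frame. The function names (`mutual_information`, `select_keyframes`), the flat-list frame representation, the bin count, and the threshold are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(frame_a, frame_b, bins=16):
    """Estimate mutual information (in bits) between two frames.

    Each frame is a flat sequence of pixel intensities in 0..255
    (an assumed representation, not the paper's).
    """
    if len(frame_a) != len(frame_b) or len(frame_a) == 0:
        raise ValueError("frames must be non-empty and of equal length")
    width = max(1, 256 // bins)
    # Quantize intensities so the joint histogram stays small.
    qa = [min(p // width, bins - 1) for p in frame_a]
    qb = [min(p // width, bins - 1) for p in frame_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    count_a = Counter(qa)
    count_b = Counter(qb)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


def select_keyframes(frames, threshold=0.5, bins=16):
    """Return indices of frames kept for the summary.

    A frame is kept when its mutual information with the most recently
    kept frame drops below `threshold` (a hypothetical tuning value).
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mutual_information(frames[kept[-1]], frames[i], bins) < threshold:
            kept.append(i)
    return kept
```

For example, `select_keyframes([f, f, g])` with `f = [0, 0, 255, 255]` and `g = [0, 255, 0, 255]` keeps indices 0 and 2, because the repeated frame is fully redundant with the first and the third shares no information with it. A real pipeline would apply this within shots found by the object detector, which this sketch leaves out.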