Fight Detection in Video Sequences Based on Multi-Stream Convolutional Neural Networks

Sarah Almeida Carneiro, Gabriel Pellegrino da Silva, Silvio Jamil Ferzoli Guimarães, Hélio Pedrini

Published: 2019, Last Modified: 15 May 2025. Venue: SIBGRAPI 2019. License: CC BY-SA 4.0

Abstract: Surveillance has become increasingly tied to forensic computing technologies. Machine learning techniques have enabled better interpretation of human actions and faster identification of anomalous events. Among the many studies in this field, the best results reported in the literature come from deep learning approaches. This study therefore uses a deep learning model based on a multi-stream architecture and high-level hand-crafted descriptors to address fight detection in videos. We focus on a multi-stream arrangement of VGG-16 networks and investigate feature descriptors that capture a video's spatial, temporal, rhythmic, and depth information. We validated our method on two fight detection datasets commonly used in the literature. Experiments showed that combining correlated information through a multi-stream strategy improved the classification performance of our deep learning approach; the use of complementary features can therefore yield results superior to those of previous studies.
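The abstract does not state how the outputs of the individual streams are combined. As a rough illustration only, the sketch below assumes a simple late-fusion scheme: each stream (here named spatial, temporal, rhythm, and depth, following the abstract) produces two class scores, the scores are turned into probabilities, and a weighted average decides the label. The function names, the equal default weights, and the example scores are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_streams(stream_logits, weights=None):
    """Weighted average of per-stream class probabilities.

    stream_logits maps a stream name to its list of class scores.
    Equal weights are an assumption, not the paper's reported setting.
    """
    if weights is None:
        weights = {name: 1.0 for name in stream_logits}
    total_w = sum(weights[name] for name in stream_logits)
    n_classes = len(next(iter(stream_logits.values())))
    fused = [0.0] * n_classes
    for name, logits in stream_logits.items():
        for i, p in enumerate(softmax(logits)):
            fused[i] += weights[name] * p / total_w
    return fused

def predict(stream_logits, weights=None, labels=("non-fight", "fight")):
    """Return the label with the highest fused probability, plus the scores."""
    fused = fuse_streams(stream_logits, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused

# Made-up scores standing in for the outputs of four VGG-16 streams.
example_logits = {
    "spatial": [0.2, 0.1],
    "temporal": [-1.0, 1.5],
    "rhythm": [0.0, 0.4],
    "depth": [0.3, 0.2],
}
label, scores = predict(example_logits)
print(label, [round(s, 3) for s in scores])
```

In this toy input the temporal stream is confident while the others are nearly undecided, so the averaged decision follows the temporal stream. This is one way complementary streams could help, under the fusion assumption stated above.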