Abstract: Abnormality detection assists human operators by reducing the amount of data that must be processed manually. However, detecting and localizing contextual abnormalities in images and video sequences involves many challenges: an object that is normal in one scenario may be considered abnormal in another. The common solution is to divide the frame into regions or patches and then perform abnormality detection on each. The performance of this patch-based approach is limited by the size of the context window and suffers from a restricted field of view, since it does not consider the information available in the entire frame at once. Increasing the patch size requires more nodes in the network and therefore more computational memory, along with significantly more trainable parameters; decreasing the number of nodes reduces both performance and the size of the context that can be captured. The proposed method overcomes these issues. The framework combines a convolutional neural network (CNN) and an adversarial autoencoder to localize contextual abnormalities. The spatial arrangement of objects across the different channels of the CNN feature map is learned jointly from normal data. The framework is further extended to reduce the number of trainable parameters, which would otherwise become a computational burden. Experimental results outperform the baseline approach in localizing contextual abnormalities.
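The following is a minimal sketch (not the authors' released code) of the frame-level pipeline the abstract describes: full-frame CNN feature maps are reconstructed by an adversarial autoencoder trained only on normal data, and large per-location reconstruction error then flags contextually abnormal regions. All module names, layer sizes, and the single training step shown are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAAE(nn.Module):
    """Adversarial autoencoder over CNN feature maps of shape (N, C, H, W)."""

    def __init__(self, channels=512, latent=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 256, 3, padding=1), nn.ReLU(),
            nn.Conv2d(256, latent, 1))
        self.decoder = nn.Sequential(
            nn.Conv2d(latent, 256, 1), nn.ReLU(),
            nn.Conv2d(256, channels, 3, padding=1))
        # Discriminator used adversarially to regularize the latent code.
        self.discriminator = nn.Sequential(
            nn.Conv2d(latent, 64, 1), nn.ReLU(),
            nn.Conv2d(64, 1, 1))

    def forward(self, feats):
        z = self.encoder(feats)
        return self.decoder(z), z


def anomaly_map(model, feats):
    """Per-location reconstruction error; high values localize abnormality."""
    recon, _ = model(feats)
    return (feats - recon).pow(2).mean(dim=1)  # (N, H, W)


if __name__ == "__main__":
    model = FeatureAAE()
    feats = torch.randn(1, 512, 30, 40)  # stand-in for full-frame backbone feature maps
    # One generator-side training step on normal data:
    # reconstruction loss plus an adversarial term on the latent code.
    recon, z = model(feats)
    d_fake = model.discriminator(z)
    loss = F.mse_loss(recon, feats) + F.binary_cross_entropy_with_logits(
        d_fake, torch.ones_like(d_fake))
    loss.backward()
    score = anomaly_map(model, feats)  # (1, 30, 40) localization map
```

Because the autoencoder operates on the whole feature map rather than on fixed-size patches, the context it can exploit is not bounded by a patch window, which is the limitation of the baseline approach noted in the abstract.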