Abstract: Road accidents are often caused by short abnormal
events, including traffic violations, abrupt changes in vehicular
motion, driver fatigue, etc. Observing an accident event from
the right camera perspective plays a crucial role in detecting
accidents. However, it may not be possible to capture such
abnormal events from a limited camera perspective. We present a
deep learning framework to analyze the accident events recorded
from multiple perspectives. First, we estimate feature similarity
in videos recorded from multiple perspectives. We then divide
the video samples into high and low feature similarity groups.
Next, we extract spatio-temporal features from each group
using two-branch DCNNs and fuse them using a rank-based
weighted average pooling strategy followed by classification.
We present a new road accident video dataset (MP-RAD), where
each accident event is synthetically generated and captured
from five independent camera perspectives using a computer
gaming platform. Most of the existing road accident datasets use
egocentric views or they are captured in fixed camera setups.
In contrast, our dataset is large and multi-perspective, and it can be
used to validate intelligent transportation system (ITS) tasks such as accident detection,
accident localization, traffic monitoring, etc. The dataset contains
400 accident events with a total of 2000 videos. We provide
temporal annotations for all videos. The proposed framework
and the dataset have been cross-validated with the latest accident
detection baselines trained on real-world road accident videos
and vice versa. The sub-optimal detection accuracy obtained
using the baselines indicates that the proposed framework and the
dataset can be useful for ITS-related research.
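
The grouping and fusion steps summarized above can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the grouping threshold, or the rank weights, so everything here is an assumption made for illustration: cosine similarity averaged over view pairs, a hypothetical threshold of 0.5, per-branch confidence values supplied by the caller, and linear rank weights (the best-ranked branch gets the largest weight). The function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np


def mean_pairwise_similarity(view_features):
    """Average cosine similarity over all pairs of per-view feature vectors.

    view_features: array of shape (n_views, dim), one vector per camera view.
    Cosine similarity is an assumed choice, not taken from the abstract.
    """
    feats = np.asarray(view_features, dtype=float)
    unit = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sims = unit @ unit.T
    upper = np.triu_indices(len(feats), k=1)  # each unordered pair once
    return float(sims[upper].mean())


def split_by_similarity(events, threshold=0.5):
    """Split event indices into high and low feature-similarity groups.

    The threshold value is a hypothetical placeholder.
    """
    high, low = [], []
    for idx, view_features in enumerate(events):
        if mean_pairwise_similarity(view_features) >= threshold:
            high.append(idx)
        else:
            low.append(idx)
    return high, low


def rank_weighted_average(branch_vectors, branch_confidences):
    """Fuse branch outputs with weights determined by confidence rank.

    With n branches, the most confident branch gets weight n, the next
    n - 1, and so on down to 1; weights are then normalized to sum to 1.
    This linear rank weighting is an assumed scheme.
    """
    vectors = np.asarray(branch_vectors, dtype=float)
    conf = np.asarray(branch_confidences, dtype=float)
    n = len(vectors)
    order = np.argsort(-conf)              # indices from best to worst
    weights = np.empty(n)
    weights[order] = np.arange(n, 0, -1)   # best -> n, worst -> 1
    weights /= weights.sum()
    return weights @ vectors
```

For example, with two branches the more confident one receives weight 2/3 and the other 1/3, so the fused vector leans toward the better-ranked branch while still using both.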