Abstract: Even though CCTV cameras are widely deployed for
traffic surveillance and therefore have the potential to become cheap automated sensors for traffic speed analysis,
their large-scale use toward this goal has not yet been reported. A key difficulty lies in the camera calibration phase. Existing state-of-the-art methods perform the
calibration using image processing or keypoint detection
techniques that require high-quality video streams, yet typical CCTV footage is low-resolution and noisy. As a result,
these methods largely fail in real-world conditions. In contrast, we propose two novel calibration techniques whose
only inputs come from an off-the-shelf object detector. Both
methods consider multiple detections jointly, leveraging the
fact that cars have similar and well-known 3D shapes with
normalized dimensions. The first minimizes an energy function corresponding to a 3D reprojection
error; the second instead learns from synthetic training
data to predict the scene geometry directly. Noticing the
lack of speed estimation benchmarks faithfully reflecting the
actual quality of surveillance cameras, we introduce a novel
dataset collected from public CCTV streams. Experiments conducted on three diverse benchmarks demonstrate excellent speed estimation accuracy that could enable
the wide use of CCTV cameras for traffic analysis, even in
challenging conditions where state-of-the-art methods completely fail. Additional information can be found on our
project web page: https://rebrand.ly/nle-cctv