Abstract: We present a detection method that is able to detect a learned target and is valid for both static and moving cameras. As an application, we detect pedestrians, but could be anything if there is a large set of images of it. The data set is fed into a number of deep convolutional networks, and then, two of these models are set in cascade in order to filter the cutouts of a multi-resolution window that scans the frames in a video sequence. We demonstrate that the excellent performance of deep convolutional networks is very difficult to match when dealing with real problems, and yet we obtain competitive results.
Loading