Abstract: In this paper, we present ResNet-based vehicle classification and localization methods using real traffic surveillance recordings. We utilize the MIOvision traffic dataset, which comprises 11 categories covering a variety of vehicles, such as bicycle, bus, car, and motorcycle. To improve the classification performance, we exploit a technique called joint fine-tuning (JF). In addition, we propose a dropping CNN (DropCNN) method to create a synergy effect with the JF. For the localization, we implement the basic concepts of a state-of-the-art region-based detector, using 50- and 101-layer residual networks as backbone convolutional feature extractors, and ensemble them into a single model. Finally, we achieve the highest accuracy on the dataset in both the classification and localization tasks compared with several state-of-the-art methods reported on the website, including VGG16, AlexNet, and ResNet50 for the classification, and YOLO, Faster R-CNN, and SSD for the localization.