Abstract: This paper presents a multi-perspective convolutional neural network (CNN) that classifies objects in support of intelligent transportation systems. The proposed model combines a Visual Geometry Group (VGG) backbone network with custom feature extraction blocks that use multilayer prediction heads. The model addresses both multi-class and multi-object classification tasks, and its multiple prediction heads allow it to classify the different objects in an image and to predict object counts. A publicly available automotive object detection dataset with 19,800 images and labels is used. The dataset covers five primary object types: Persons, Trucks, Motorbikes, Cars, and Cyclists. Pre-trained models such as VGG, ResNet, EfficientNet, and DenseNet were also tested on the dataset, and their classification performance was evaluated. The experimental results indicate that the proposed VGG-backbone CNN model outperforms the other pre-trained models.