Abstract: The ability to recognize and classify 3D object models into discrete categories is increasingly important, with applications ranging from augmented reality to robotics and self-driving cars. In this paper, we present a novel approach to 3D object recognition that transforms 3D objects into 2D representations that preserve depth information as well as possible. The transformed models are then fed into Vision Transformer-based neural networks that classify them into 40 discrete categories. For training, we use Princeton's ModelNet40 data set, which contains 12,311 pre-aligned models labeled into 40 categories. We believe this approach is important because it allows recent breakthroughs in 2D classification, such as Vision Transformers, to be applied to 3D classification.
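As an illustration of what a depth-preserving 2D representation could look like, the following minimal sketch renders a set of 3D points as an orthographic depth image. The abstract does not specify the transformation actually used, so this is only one possible choice: the function name `depth_image`, the `size` parameter, and the viewing direction along the z axis are all assumptions made for this example.

```python
def depth_image(points, size=32):
    """Project (x, y, z) points to a size x size depth image (hypothetical helper).

    Assumption: the viewer looks along the z axis and larger z means closer,
    so each pixel keeps the largest normalized z that falls into it.
    """
    # Shift to the origin and scale uniformly so the model fits the unit cube.
    lo = [min(p[i] for p in points) for i in range(3)]
    span = max(max(p[i] for p in points) - lo[i] for i in range(3)) or 1.0

    # Background pixels stay at 0.0; points at the far plane also map to 0.0.
    image = [[0.0] * size for _ in range(size)]
    for x, y, z in points:
        col = min(int((x - lo[0]) / span * size), size - 1)
        row = min(int((y - lo[1]) / span * size), size - 1)
        depth = (z - lo[2]) / span  # normalized to [0, 1]
        image[row][col] = max(image[row][col], depth)
    return image


# Three sample points; two of them fall into the same pixel.
img = depth_image([(0, 0, 0), (1, 1, 1), (0, 0, 0.5)], size=4)
```

An image of this kind is a single-channel 2D array, so it could in principle be passed to a 2D classifier after resizing to the input resolution the network expects.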